# Using the Data Migration Tool
The Data Migration Tool was introduced in version 9 of the Frappé Framework.
The Data Migration Tool abstracts the syncing of data between a remote source and a DocType. It acts as a middleware layer between your Frappé-based site and the remote data source.
To understand this tool, let's make a connector to push ERPNext Items to an imaginary service called Atlas.
## Data Migration Plan
A Data Migration Plan encapsulates a set of mappings.
Let's make a new Data Migration Plan. Set the plan name as 'Atlas Sync'. We also need to add mappings in the mappings child table.
## Data Migration Mapping
A Data Migration Mapping is a set of rules that specify field-to-field mapping.
Make a new Data Migration Mapping. Call it 'Item to Atlas Item'.
To define a mapping, we need to put in some values that define the structure of local and remote data.
- Remote Objectname: a name that identifies the remote object, e.g. Atlas Item
- Remote Primary Key: the name of the primary key of the Atlas Item, e.g. id
- Local DocType: the DocType that will be used for syncing, e.g. Item
- Mapping Type: a Mapping can be of type 'Push' or 'Pull', depending on whether the data is to be mapped remotely or locally. It can also be 'Sync', which performs both push and pull operations in a single cycle.
- Page Length: the batch size of the sync.
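Put together, the saved mapping's key values might look like this (a sketch only; the entries under `fields` use hypothetical fieldnames, shown purely to illustrate the structure):

```python
# Sketch of the 'Item to Atlas Item' mapping's values after saving.
# The fieldnames in 'fields' are hypothetical examples.
mapping = {
    "mapping_name": "Item to Atlas Item",
    "remote_objectname": "Atlas Item",
    "remote_primary_key": "id",
    "local_doctype": "Item",
    "mapping_type": "Push",
    "page_length": 10,
    # each row pairs a local fieldname with a remote fieldname
    "fields": [
        {"local_fieldname": "item_name", "remote_fieldname": "title"},
        {"local_fieldname": "description", "remote_fieldname": "body"},
    ],
}
```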
Specifying field mappings:
The most basic field mapping simply specifies the fieldnames of the remote and local object. However, if the mapping is one-way (push or pull), the source fieldname can also take literal values in quotes (e.g. "GadgetTech") and eval statements (e.g. "eval:frappe.db.get_value('Company', 'Gadget Tech', 'default_currency')"). For example, in a push mapping, the local fieldname can be set to a string in quotes or an eval expression instead of a fieldname from the local DocType. (This is not possible with a sync mapping, where both local and remote fieldnames serve as a target destination at some point and thus cannot be literal values.)
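The three kinds of source values can be sketched as a small resolver. This is purely illustrative, not the framework's actual implementation; in particular, the framework evaluates `eval:` expressions with its own context (`frappe` and friends available):

```python
def resolve_source_value(source_fieldname, doc):
    """Resolve a one-way mapping's source value (illustrative sketch only):
    quoted strings are literals, 'eval:' expressions are evaluated, and
    anything else is read as a fieldname from the source doc."""
    if source_fieldname.startswith('"') and source_fieldname.endswith('"'):
        return source_fieldname[1:-1]        # literal value, e.g. "GadgetTech"
    if source_fieldname.startswith("eval:"):
        # the framework evaluates these with its own context
        return eval(source_fieldname[len("eval:"):])
    return doc.get(source_fieldname)         # plain fieldname lookup

doc = {"item_name": "Gadget Wand"}
resolve_source_value('"GadgetTech"', doc)    # -> 'GadgetTech' (literal)
resolve_source_value("item_name", doc)       # -> 'Gadget Wand' (field lookup)
```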
Let's add the field mappings and save:
We can now add the 'Item to Atlas Item' mapping to our Data Migration Plan and save it.
Additional layer of control with pre and post process:
Migrating data frequently involves more steps than a simple one-to-one field mapping. For every Data Migration Mapping added to a Plan, a mapping module is generated inside the module specified in that Plan.
In our case, an item_to_atlas_item module is created under the data_migration_mapping directory in Integrations (module for the 'Atlas Sync' plan).
You can implement the pre_process method (which receives the source doc) and the post_process method (which receives both the source and target docs, as well as any additional arguments) to extend the mapping process. Here's what some operations could look like:
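For instance, the generated item_to_atlas_item module could implement the two hooks along these lines (a sketch only; the fieldnames and transformations are hypothetical):

```python
# data_migration_mapping/item_to_atlas_item.py (sketch; fieldnames are hypothetical)

def pre_process(doc):
    """Receives the source doc before the field mappings are applied;
    returns the (possibly modified) doc the mappings should read from."""
    # e.g. normalize the item name before it is pushed
    doc["item_name"] = doc.get("item_name", "").strip().title()
    return doc

def post_process(source, target, **kwargs):
    """Receives both the source and target docs after mapping, plus any
    additional arguments the run passes through."""
    # e.g. fill a computed field that has no one-to-one mapping
    target["description"] = "Synced from ERPNext Item {0}".format(source.get("name"))
    return target
```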
## Data Migration Connector
Now, to connect to the remote source, we need to create a Data Migration Connector.
We only have two connector types right now, so let's add another Connector Type option in the Data Migration Connector DocType.
Now, let's create a new Data Migration Connector.
As you can see, we chose 'Atlas' as the Connector Type. We also added the hostname, username and password of our Atlas instance so that we can authenticate.
Now, we need to write the code for our connector so that we can actually push data.
Create a new file called atlas_connection.py in the frappe/data_migration/doctype/data_migration_connector/connectors/ directory, where the other connectors also live.
We just have to implement the insert, update and delete methods for our Atlas connector, and write the code that connects to our Atlas instance in the __init__ method. See frappe_connection.py for reference.
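A minimal sketch of what atlas_connection.py could look like follows. The remote Atlas service is imaginary, so its client is stubbed here with an in-memory store; in the real file you would follow frappe_connection.py, subclass the framework's base connection class, and talk to the actual service over the network:

```python
# connectors/atlas_connection.py (sketch). AtlasClient is a stand-in for an
# imaginary remote SDK; the real connector would subclass the framework's
# base connection class as frappe_connection.py does.

class AtlasClient:
    """Stub for the imaginary Atlas service: stores records in memory."""
    def __init__(self, hostname, username, password):
        self.records = {}
        self.next_id = 1

    def create(self, objectname, doc):
        doc = dict(doc, id=self.next_id)
        self.records[(objectname, doc["id"])] = doc
        self.next_id += 1
        return doc

    def update(self, objectname, doc, record_id):
        self.records[(objectname, record_id)].update(doc)
        return self.records[(objectname, record_id)]

    def delete(self, objectname, record_id):
        return self.records.pop((objectname, record_id), None)

class AtlasConnection:
    """Implements the insert/update/delete interface a connector needs."""
    def __init__(self, connector):
        # 'connector' is the Data Migration Connector doc with the credentials;
        # connect to the remote instance in __init__
        self.connection = AtlasClient(connector["hostname"],
                                      connector["username"],
                                      connector["password"])
        self.name_field = "id"  # the remote primary key

    def insert(self, doctype, doc):
        return self.connection.create(doctype, doc)

    def update(self, doctype, doc, migration_id):
        return self.connection.update(doctype, doc, migration_id)

    def delete(self, doctype, migration_id):
        return self.connection.delete(doctype, migration_id)
```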
After creating the Atlas Connector, we also need to import it into data_migration_connector.py
## Data Migration Run
Now that we have our connector, the last thing to do is to create a new Data Migration Run.
A Data Migration Run takes a Data Migration Plan and a Data Migration Connector and executes the plan according to our configuration. It takes care of queueing, batching, delta updates and more.
Just click Run. It will push our Items to the remote Atlas instance, and you can watch the progress update in realtime.
After a run is executed successfully, you cannot run it again. You will have to create another run and execute it.
A Data Migration Run tries to be as efficient as possible: the next time you execute one, it will only push the items that were changed or that failed in the last run.
Note: The Data Migration Tool is still in beta. If you find any issues, please report them here