In DM v2.0, why does the full import task fail if DM restarts during the task?

In DM v2.0.1 and earlier versions, if DM restarts before the full import completes, the bindings between upstream data sources and DM-worker nodes might change. For example, the intermediate data of the dump unit might be on DM-worker node A while the load unit runs on DM-worker node B; because the load unit cannot access the exported files, the import fails.

The following are two solutions to this issue:

  • If the data volume is small (less than 1 TB) or the task merges sharded tables, take these steps:

    1. Clean up the imported data in the downstream database.
    2. Remove all files in the directory of exported data.
    3. Delete the task using dmctl, and then run start-task --remove-meta to create a new task.

    After the new task starts, make sure that there is no redundant DM-worker node, and avoid restarting or upgrading the DM cluster while the full import is in progress.
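    The cleanup and restart in the steps above might look like the following sketch. The master address, task name, export directory, and configuration file name are all placeholders; substitute the values from your own deployment:

      # Delete the existing task (task name is a placeholder)
      dmctl --master-addr 127.0.0.1:8261 stop-task test-task

      # Remove the previously exported data (path is a placeholder)
      rm -rf ./dumped_data.test-task

      # Start a new task, discarding the old checkpoint metadata
      dmctl --master-addr 127.0.0.1:8261 start-task --remove-meta task.yaml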

  • If the data volume is large (more than 1 TB), take these steps:

    1. Clean up the imported data in the downstream database.
    2. Deploy TiDB-Lightning to the DM-worker nodes that store the exported data.
    3. Use the Local-backend mode of TiDB-Lightning to import the data that the dump unit exports.
    4. After the full import completes, edit the task configuration file in the following ways and restart the task:
      • Change task-mode to incremental.
      • Set the value of mysql-instance.meta.pos to the position recorded in the metadata file that the dump unit outputs.
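    After switching to incremental replication, the relevant part of the task configuration might look like the following sketch. The source ID, binlog name, and binlog position are placeholders to be replaced with the values recorded in the metadata file that the dump unit writes to the export directory; check the exact field names against the task configuration reference for your DM version:

      task-mode: incremental   # changed from the full import mode

      mysql-instances:
        - source-id: "mysql-replica-01"        # placeholder source ID
          meta:
            binlog-name: "mysql-bin.000003"    # from the dump unit's metadata file
            binlog-pos: 194                    # from the dump unit's metadata file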