Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: dump和lighting导出导入时间太久,导致ticdc失败 (Dumpling export and Lightning import take too long, causing TiCDC to fail)
[TiDB Usage Environment] Production Environment
[TiDB Version]
[Encountered Problem: Problem Phenomenon and Impact]
Due to the large version gap between the new and old clusters, we want to migrate some data without upgrading the old cluster.
Step 1: Used Dumpling to export, which took 24 hours.
Step 2: Used TiDB Lightning to import, which took 24 hours.
Today, when creating a TiCDC changefeed with start-ts set to July 17th, it reported:
[CDC:ErrStartTsBeforeGC]fail to create changefeed because start-ts 442908153428574216 is earlier than GC safepoint at 442980806507102208
Dear experts, how can this be solved? Do we need to set the GC lifetime of the TiDB cluster to something very long, like 3 days, before configuring TiCDC?
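To see how far apart the two values in the error message are, they can be decoded by hand. The sketch below relies on the TSO layout (physical milliseconds since the Unix epoch in the high bits, an 18-bit logical counter in the low bits). The helper name `tso_to_datetime` is made up for illustration and is not part of any TiDB tool.

```python
from datetime import datetime, timedelta, timezone

# A TiDB TSO packs a physical timestamp (milliseconds since the Unix epoch)
# into the high bits and an 18-bit logical counter into the low bits.
LOGICAL_BITS = 18
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tso_to_datetime(tso: int) -> datetime:
    """Return the UTC wall-clock time encoded in a TSO (logical part dropped)."""
    physical_ms = tso >> LOGICAL_BITS
    return EPOCH + timedelta(milliseconds=physical_ms)


# Values copied from the error message in this thread.
start_ts = 442908153428574216
gc_safe_point = 442980806507102208

start_time = tso_to_datetime(start_ts)
safe_time = tso_to_datetime(gc_safe_point)
gap = safe_time - start_time

print("start-ts     :", start_time.isoformat())
print("GC safepoint :", safe_time.isoformat())
print("gap (hours)  :", round(gap.total_seconds() / 3600, 1))
```

Decoded this way, the start-ts falls on 2023-07-17 and the safepoint on 2023-07-20 (both UTC), roughly 77 hours apart. The GC lifetime would therefore have needed to cover more than three days for this changefeed to be accepted.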
For large data migrations, it is recommended to turn off GC first:
SET GLOBAL tidb_gc_enable=FALSE;
Yes, the GC lifetime needs to be set to a larger value, otherwise the changefeed will fail. If you are worried about running out of disk space with a longer GC lifetime, you can consider using TiDB Binlog for synchronization.
If you don’t want to change the GC settings, you can try using TiDB Binlog to synchronize data. If the binlog is stored locally, you can set its retention period.
Increase the GC lifetime. The default of 10 minutes definitely won’t work.
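As a rough way to size the GC lifetime, the measured export and import times can be added up with some headroom. This is only a sketch: the 12-hour default margin is an arbitrary assumption, and `gc_life_time_for` is a hypothetical helper. It returns a Go-style duration string such as `60h`, which is the form the GC lifetime setting takes.

```python
import math


def gc_life_time_for(export_hours: float, import_hours: float,
                     margin_hours: float = 12.0) -> str:
    """Return a Go-style duration string covering export + import + margin."""
    total_hours = export_hours + import_hours + margin_hours
    # Round up to whole hours so the result never undershoots the total.
    return f"{math.ceil(total_hours)}h"


# 24 h export + 24 h import, as reported in this thread, plus the assumed margin.
print(gc_life_time_for(24, 24))
```

The lifetime has to cover the whole window from the export snapshot until the changefeed is created, so any idle time between the steps belongs in the margin as well.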
- Adjust TiCDC synchronization start-ts: You can specify a timestamp greater than the current GC safepoint as the start-ts, ensuring it is not earlier than the GC safepoint. You can use the following command to get the current cluster’s GC safepoint:
SELECT VARIABLE_NAME, VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_safe_point';
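Then, based on the VARIABLE_VALUE in the returned result, set TiCDC’s start-ts to a timestamp slightly later than this value. Note that changes committed between the export snapshot and the new start-ts will not be replicated, so this only works if that gap is acceptable or can be backfilled.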
- Adjust the GC lifetime of the TiDB cluster: The GC safepoint itself is not a system variable and cannot be set with SET GLOBAL. GC advances it automatically according to the GC lifetime, and data older than the safepoint cannot be recovered once collected. To keep the safepoint from passing the export snapshot, extend the GC lifetime before starting the export, for example (on versions before v5.0, the equivalent setting is the tikv_gc_life_time row in mysql.tidb):
SET GLOBAL tidb_gc_life_time = '72h';
Note that a longer GC lifetime retains more historical versions, which increases disk usage and may impact the cluster. Please ensure you fully understand its effects and restore the original value once the changefeed is running.
- Use the DM tool: If your data migration involves different versions of the TiDB cluster, you can also consider using the TiDB Data Migration (DM) tool. DM can help you migrate data between different versions and handle some compatibility issues during the migration process. Using the DM tool for data migration can be more convenient and flexible.
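For the first option above, the safepoint read from mysql.tidb has to be turned into a TSO before it can be passed as start-ts. The sketch below assumes the stored value is a time string of the form `20230720-15:21:11 +0800`. That format is an assumption to check against your own cluster, and the sample value was picked only so that it lines up with the safepoint in the error message. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

LOGICAL_BITS = 18  # low 18 bits of a TSO hold the logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Assumed layout of the tikv_gc_safe_point value in mysql.tidb.
SAFE_POINT_FORMAT = "%Y%m%d-%H:%M:%S %z"


def datetime_to_tso(moment: datetime) -> int:
    """Encode a timezone-aware datetime as a TSO with a zero logical part."""
    physical_ms = (moment - EPOCH) // timedelta(milliseconds=1)
    return physical_ms << LOGICAL_BITS


def start_ts_after(safe_point_text: str, margin_seconds: int = 60) -> int:
    """Return a TSO that lies margin_seconds after the given safepoint string."""
    safe_point = datetime.strptime(safe_point_text, SAFE_POINT_FORMAT)
    return datetime_to_tso(safe_point + timedelta(seconds=margin_seconds))


# Hypothetical safepoint string, chosen to match the error in this thread.
print(start_ts_after("20230720-15:21:11 +0800"))
```

The margin matters because the string only has one-second resolution. With a margin of zero, the result here would still be slightly earlier than the safepoint 442980806507102208 from the error message and would be rejected again.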
How large is the data volume, that it takes 24 hours? Can you look into reducing the export and import time?
Agreed. First confirm whether this was a logical export followed by a logical import.