Dumpling and Lightning export/import takes too long, causing TiCDC to fail

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: dump和lighting导出导入时间太久,导致ticdc失败

| username: 扬仔_tidb

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Encountered Problem: Problem Phenomenon and Impact]
Due to the large version gap between the new and old clusters, we want to migrate some data without upgrading the old cluster.
Step 1: Used Dumpling to export, which took 24 hours.
Step 2: Used Lightning to import, which took 24 hours.
Today, when creating a TiCDC changefeed with start-ts set to July 17th, it reported:

[CDC:ErrStartTsBeforeGC]fail to create changefeed because start-ts 442908153428574216 is earlier than GC safepoint at 442980806507102208

Dear experts, how can this be solved? Do we need to set the GC time of the TiDB cluster to something very long, like 3 days, before configuring TiCDC?
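
For reference, the two numbers in the error message can be decoded to see how far the GC safepoint has moved past the requested start-ts. The sketch below assumes the usual TSO layout (physical milliseconds in the high bits, an 18-bit logical counter in the low bits); the helper names `decode_tso` and `tso_to_datetime` are made up for this illustration and are not part of any TiDB tool.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Assumption: a TSO is (physical milliseconds since the Unix epoch) << 18,
# plus an 18-bit logical counter in the low bits.
LOGICAL_BITS = 18
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_tso(tso: int) -> Tuple[int, int]:
    """Split a TSO into (physical_ms, logical)."""
    return tso >> LOGICAL_BITS, tso & ((1 << LOGICAL_BITS) - 1)


def tso_to_datetime(tso: int) -> datetime:
    """Return the UTC wall-clock time stored in the TSO's physical part."""
    physical_ms, _ = decode_tso(tso)
    return EPOCH + timedelta(milliseconds=physical_ms)


start_ts = 442908153428574216      # start-ts from the error message
gc_safepoint = 442980806507102208  # GC safepoint from the error message

gap = tso_to_datetime(gc_safepoint) - tso_to_datetime(start_ts)
print("start-ts     :", tso_to_datetime(start_ts).isoformat())
print("GC safepoint :", tso_to_datetime(gc_safepoint).isoformat())
print("safepoint is ahead by about", round(gap.total_seconds() / 3600, 1), "hours")
```

Under that assumption, the start-ts falls on 17 July 2023 (UTC) and the safepoint is roughly 77 hours later, so the history needed by the changefeed had already been collected by the time it was created.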

| username: tidb菜鸟一只

For large data migrations, it is recommended to turn off GC first:
SET GLOBAL tidb_gc_enable=FALSE;

| username: 啦啦啦啦啦

Yes, the GC lifetime needs to be set to a larger value, otherwise the changefeed will fail to start. If you are worried about running out of disk space when increasing the GC lifetime, you can consider using TiDB Binlog for synchronization instead.

| username: 像风一样的男子

If you don’t want to change the GC settings, you can try using TiDB Binlog to synchronize the data. If the binlog is stored locally, you can set its retention period.

| username: 孤君888

Increase the GC time. The default 10 minutes definitely won’t work.
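
As a rough sketch of how large the value needs to be: the GC lifetime has to cover the export, the import, and whatever time passes before the changefeed is created. The function names and the 48-hour margin below are hypothetical, and the generated statement assumes the upstream cluster exposes the `tidb_gc_life_time` system variable (v5.0 and later); older versions keep this setting in the `mysql.tidb` table instead.

```python
import math


def gc_life_time_hours(export_hours: float, import_hours: float,
                       margin_hours: float) -> int:
    """Hours of history GC must keep: export + import + a safety margin."""
    return math.ceil(export_hours + import_hours + margin_hours)


def gc_life_time_statement(hours: int) -> str:
    """Build the SQL to run on the upstream cluster before starting the export.

    Assumption: the cluster supports the tidb_gc_life_time system variable.
    """
    return f"SET GLOBAL tidb_gc_life_time = '{hours}h';"


# 24 h export + 24 h import, as in this thread, plus a hypothetical 48 h margin.
hours = gc_life_time_hours(export_hours=24, import_hours=24, margin_hours=48)
print(gc_life_time_statement(hours))
```

The margin matters: in this thread the changefeed was created well after the import finished, so export time plus import time alone would not have been enough.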

| username: ljluestc

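  1. Adjust the TiCDC start-ts: specify a start-ts that is not earlier than the current GC safepoint. You can use the following command to get the cluster’s current GC safepoint:
SELECT VARIABLE_NAME, VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_safe_point';

Then, based on the VARIABLE_VALUE in the returned result, adjust TiCDC’s start-ts to a timestamp slightly later than this value. Changes committed between the export snapshot and the new start-ts will not be replicated, so this only works if that gap is acceptable or is backfilled some other way.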

  2. Extend the GC lifetime of the TiDB cluster: the GC safepoint only moves forward and cannot be set directly; tikv_gc_safe_point is a status record in mysql.tidb, not a system variable. What you can control is how much history GC keeps, so extend the GC lifetime before the next export so that it covers the export, the import, and the creation of the changefeed. On TiDB v5.0 and later:
SET GLOBAL tidb_gc_life_time = '72h';

Note that a longer GC lifetime keeps more historical versions, which increases disk usage and can slow down reads. Please make sure you understand the impact before proceeding, and restore the original value once the changefeed has been created.

  3. Use the DM tool, if applicable: TiDB Data Migration (DM) replicates from MySQL-compatible upstreams such as MySQL and MariaDB into TiDB by reading the MySQL binlog. It is an option only if the data being migrated also lives in such an upstream; DM does not read from a TiDB cluster, so it cannot replace TiCDC for TiDB-to-TiDB replication.

| username: redgame

Increase GC time

| username: zhanggame1

How large is the data volume that it takes 24 hours? Can you look into how to reduce the export and import time?

| username: 有猫万事足

Indeed, first confirm whether it is a logical export and then a logical import.