Large Data Migration

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 大数据量迁移

| username: Hacker_piqLjvT2

We have a TiDB 5.3.1 cluster with 30 TB of data. What is a good way to migrate it online to TiDB 7.1 without a long business downtime? TiCDC only handles incremental data, and a dump followed by a restore takes too long.
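To put "takes too long" into numbers, here is a rough back-of-the-envelope sketch. The throughput figures are assumptions chosen for illustration, not measurements from this cluster, and the function name `transfer_hours` is made up for this example.

```python
def transfer_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to move data_tb terabytes at a sustained throughput_mb_s (decimal units)."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    total_mb = data_tb * 1_000_000  # 1 TB = 1,000,000 MB
    return total_mb / throughput_mb_s / 3600  # seconds -> hours


if __name__ == "__main__":
    # Assumed sustained rates in MB/s; real export/import rates depend on
    # hardware, network, and tool settings.
    for rate in (100, 300, 1000):
        print(f"{rate} MB/s -> {transfer_hours(30, rate):.1f} h")
```

The export and the import each take roughly this long, so the total for a dump followed by a restore is about the sum of the two phases.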

| username: 像风一样的男子 | Original post link

Data of this size is usually backed up with BR. BR performs a hot backup, so it does not require downtime.

| username: tidb菜鸟一只 | Original post link

BR is not supported for going from 5.3 to 7.1, so you can only use Dumpling + TiCDC. If the data volume is large for Dumpling, you can check this out:

| username: zhanggame1 | Original post link

DM can do full replication; you can give it a try.

| username: Hacker_piqLjvT2 | Original post link

Thank you. The plan we are considering is one of the following:
1. Install version 5.3 on both sides, migrate using BR + TiCDC, then upgrade the new TiDB cluster to version 7.1.
2. Use TiCDC with a separate synchronization task for each database, migrating one database at a time.

| username: redgame | Original post link

Your idea is feasible.

| username: 像风一样的男子 | Original post link

Is your final goal only to upgrade the database, or to migrate the database and upgrade it to a higher version at the same time? If everything is within the same data center, you can add new nodes on the new servers to the existing cluster and then remove the nodes on the old servers to achieve the migration. After that, you can upgrade the database.

| username: Hacker_piqLjvT2 | Original post link

We are setting up TiDB on new servers, then migrating the data and upgrading. The old servers are on the government cloud, not in the same data center. The service cannot be down for an extended period.

| username: 像风一样的男子 | Original post link

If the network latency is high between different data centers, the CDC synchronization delay will also be significant.

| username: Hacker_piqLjvT2 | Original post link

TiCDC has replication lag, so it seems we can only stop the application on the primary database temporarily and wait for the downstream to catch up. Is there a simpler way?
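The "stop writes, then wait for it to catch up" step could be scripted roughly as below. This is only a sketch: `get_lag_seconds` is a hypothetical callable that you would have to implement yourself (for example, from the changefeed checkpoint), and the thresholds are assumed values, not recommendations.

```python
import time
from typing import Callable


def wait_for_catch_up(
    get_lag_seconds: Callable[[], float],  # hypothetical: returns current replication lag
    max_lag_seconds: float = 1.0,          # assumed threshold for "caught up"
    timeout_seconds: float = 600.0,        # give up after this long
    poll_interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the lag until it drops to max_lag_seconds or below; return False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if get_lag_seconds() <= max_lag_seconds:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_seconds)


if __name__ == "__main__":
    # Simulated lag readings standing in for a real lag source.
    readings = iter([30.0, 12.0, 0.5])
    caught_up = wait_for_catch_up(lambda: next(readings), sleep=lambda _: None)
    print("safe to switch traffic" if caught_up else "timed out, resume writes on the old cluster")
```

With this approach the downtime is limited to the time the downstream needs to drain the remaining lag after writes stop, rather than the duration of the full data copy.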