Are there any mature solutions or one-click tools for migrating 40 TB of data from TiDB v3.1.0 to a new cluster?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb v3.1.0 40T数据迁移到新集群,有啥好的成熟方案,或者一键迁移工具吗

| username: xingzhenxiang

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Reproduction Path] Need to migrate data from a v3.1.0 cluster to a new cluster
[Encountered Problem: Problem Phenomenon and Impact]
[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]

Is there a mature solution or a one-click tool for migrating 40 TB of data from a TiDB v3.1.0 cluster to a new cluster?

| username: Billmay表妹 | Original post link

This version is quite old; you may need to move to a newer version of the cluster.

Which version of the cluster do you want to migrate to?

| username: xingzhenxiang | Original post link

Old Cluster
Currently required to remain as is

New Cluster
Primary goal: v7.1.x
Ultimate goal: v7.5.x (MySQL 8.0 compatibility)

Any good suggestions?

| username: Billmay表妹 | Original post link

When do you plan to migrate?
You could consider v7.1.2 or v6.5.5 first.

| username: xingzhenxiang | Original post link

I am currently collecting feasible plans. Are there any DTS (Data Transmission Service) tools available?

| username: Billmay表妹 | Original post link

There were some discussions before.

| username: xingzhenxiang | Original post link

The main issue is how to export this 40 TB of data quickly. Binlog is not currently enabled, and even if it were, the existing full data would still have to be migrated.

| username: 像风一样的男子 | Original post link

You will need to test the export; it won't be fast. Export time at this scale is measured in days.
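
For a rough sense of why the export is measured in days, here is a back-of-the-envelope estimate. The throughput figures are assumptions for illustration, not measurements from this cluster, and 1 TB is treated as 1024 × 1024 MB.

```python
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1024 * 1024  # binary units (an assumption for this estimate)


def estimate_days(total_tb: float, throughput_mb_s: float) -> float:
    """Return the days needed to move total_tb at a sustained throughput."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    return total_tb * MB_PER_TB / throughput_mb_s / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical sustained rates; measure a small export first to get a real number.
    for rate in (50, 200, 500):
        print(f"{rate} MB/s -> {estimate_days(40, rate):.1f} days")
```

Running a small trial export and plugging the observed rate into `estimate_days` should give a more realistic figure than any of the sample rates above.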

| username: xingzhenxiang | Original post link

Besides S3, which file systems can be used for this shared network disk?

| username: 像风一样的男子 | Original post link

I mounted an NFS shared disk; the setup was relatively simple.

| username: xingzhenxiang | Original post link

How is the data aggregated after the backup is completed?

| username: 像风一样的男子 | Original post link

You just need to mount the shared disk on all the TiKV nodes. Each node writes its backup files to that path, so the BR backup is aggregated automatically.
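
The shared-disk approach described here can be sketched as follows. The script only checks the mount point and builds the command line; it does not run the backup. The PD address and mount path are hypothetical, and the `br backup full` flags (`--pd`, `--storage`) should be verified against the BR documentation for the version in use.

```python
import os
import shlex
from typing import List


def shared_disk_ready(path: str) -> bool:
    """Return True if path is a mount point (check this on every TiKV node)."""
    return os.path.ismount(path)


def build_br_backup_cmd(pd_addr: str, backup_dir: str) -> List[str]:
    """Build a full-backup command that writes to a locally mounted path."""
    return [
        "br", "backup", "full",
        "--pd", pd_addr,
        "--storage", f"local://{backup_dir}",
    ]


if __name__ == "__main__":
    backup_dir = "/mnt/br_backup"  # hypothetical NFS mount point
    pd_addr = "10.0.0.1:2379"      # hypothetical PD address
    if not shared_disk_ready(backup_dir):
        print(f"warning: {backup_dir} is not a mount point on this host")
    print(shlex.join(build_br_backup_cmd(pd_addr, backup_dir)))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the mount check is the part worth repeating on each TiKV host before starting a real backup.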

| username: 像风一样的男子 | Original post link

Take a closer look at the BR backup documentation:

| username: Fly-bird | Original post link

Use the BR tool

| username: tidb菜鸟一只 | Original post link

Even if you use BR, it only copies the full data to the new cluster. For the final cutover you still need binlog to sync the incremental changes, since v3.1 does not support TiCDC.

| username: xingzhenxiang | Original post link

I’ll look into it. BR v3.1.0 doesn’t seem to offer that much.

| username: xingzhenxiang | Original post link

Currently researching this method.

| username: xingzhenxiang | Original post link

Currently researching how to store the incremental data in a queue.

| username: Kongdom | Original post link

Deploy a new cluster with v3.1.0, then perform a backup and restore. After the restoration, upgrade to v7.1.x.

| username: xingzhenxiang | Original post link

Can the logical backup be directly applied to v7.1.x?