Issues with Full Data Migration Between Different Versions of TiDB Clusters

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 不同版本TiDB集群间的全量数据迁移问题

| username: EricSong

Since CentOS7 will stop being maintained next year, the old machines need to be replaced with RHEL8 machines. Currently, only v6.x supports RHEL8, while the existing cluster runs v4.x. Upgrading the cluster and changing the operating system at the same time is quite challenging, so we plan to build a new cluster directly on the new machines and then migrate the data.

The ideal plan is to build a new v6.5 cluster on the new machines and then migrate the old data to the new cluster. However, the officially recommended solution does not support cross-version migration. Upgrading two major versions and then migrating also requires a lot of time and effort.
I would like to ask if there are any other cross-version full data migration solutions (the old cluster can be shut down for a period during migration, so full migration is sufficient).

[Existing Cluster Parameters]
Version: v4.0.11
Data Volume: Approximately 10T
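
Since the old cluster can be taken offline during the migration, the length of the downtime window depends mostly on sustained throughput. A rough sketch of that arithmetic follows. It treats "10T" as 10 TiB, and the throughput figures are hypothetical placeholders, not measurements from this cluster.

```python
def transfer_hours(data_tib, throughput_mib_s):
    """Hours needed to move data_tib TiB at a sustained rate in MiB/s."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    total_mib = data_tib * 1024 * 1024  # TiB -> MiB
    return total_mib / throughput_mib_s / 3600


# Hypothetical sustained rates; export and import are separate passes,
# so the total window is roughly the sum of both.
for rate in (50, 100, 200):
    print(f"{rate} MiB/s -> {transfer_hours(10, rate):.1f} h per pass")
```

Measuring the real export and import rates on a small subset of tables first gives a far better estimate than any assumed number.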

| username: 啦啦啦啦啦 | Original post link

This plan is fine, and the documentation does not say that cross-version migration is unsupported. If BR turns out to be incompatible and the data volume is not particularly large, using Dumpling together with incremental backups is also feasible.

| username: EricSong | Original post link

Okay, so does that mean BR doesn’t care about version differences between the two clusters? However, I’m a bit confused because the screenshot above mentions compatibility issues in several versions of v6.x. If there are similar compatibility issues within the same major version, does that mean there are also compatibility issues between major versions? I haven’t seen any official explanation on this, so I’m a bit puzzled.

| username: 啦啦啦啦啦 | Original post link

Upgrading across such a large version gap may cause compatibility issues. How about using Dumpling for the full data synchronization? I see that the data is 10TB, which is not too much. We are also planning to use this method, combined with binlog synchronization, to upgrade from v3 to v6. Back on v3 there was no BR or TiCDC, so we could only use Dumpling.

| username: GreenGuan | Original post link

It is recommended to focus testing on adaptation issues, SQL statement compatibility, and performance regressions after the business switches to the new version.

Based on the business scenarios you provided, here are a few solutions for your reference:

  1. Logical Import Method
    Source Database 4.x: Logical backup (Dumpling) to SQL files
    Target End 6.5: Logical import (TiDB Lightning) of the SQL files; 6.5 reportedly has about a 10-fold performance improvement

  2. Physical Import Method
    Source Database 4.x: Export using the BR tool
    Target End Same Version 4.x: Import using the BR tool
    Target End Same Version 4.x: Offline upgrade (--offline) to 5.x, then offline upgrade to 6.x

  3. Non-Stop Maintenance Method
    Use the logical export and import method (avoiding BR tool compatibility issues), then use TiCDC for synchronization (I remember 4.0.11 has improvements for TiCDC), and then gradually switch the business over.
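
A minimal sketch of what solution 1 could look like in practice: a small Python helper that assembles the Dumpling export command and a TiDB Lightning configuration file. The host addresses, directories, and tuning values are hypothetical placeholders, and the `local` backend is an assumption chosen for illustration. The password is left out on purpose and would need to be supplied separately.

```python
import shlex


def dumpling_argv(host, port, user, out_dir,
                  threads=8, rows=200000, file_size="256MiB"):
    """Build a Dumpling command line that exports SQL files from the source."""
    return [
        "dumpling",
        "-h", host,
        "-P", str(port),
        "-u", user,
        "--filetype", "sql",
        "-t", str(threads),   # export concurrency
        "-r", str(rows),      # split tables into chunks of about this many rows
        "-F", file_size,      # upper bound on the size of each output file
        "-o", out_dir,
    ]


def lightning_config(dump_dir, sorted_kv_dir, tidb_host, pd_addr,
                     tidb_port=4000, status_port=10080, user="root"):
    """Render a TiDB Lightning TOML config pointing at the Dumpling output."""
    return f"""[tikv-importer]
backend = "local"
sorted-kv-dir = "{sorted_kv_dir}"

[mydumper]
data-source-dir = "{dump_dir}"

[tidb]
host = "{tidb_host}"
port = {tidb_port}
user = "{user}"
status-port = {status_port}
pd-addr = "{pd_addr}"
"""


# Hypothetical addresses and paths, for illustration only.
export_cmd = dumpling_argv("10.0.0.1", 4000, "root", "/data/dump")
import_cmd = ["tidb-lightning", "-config", "tidb-lightning.toml"]
config_text = lightning_config("/data/dump", "/data/sorted-kv",
                               "10.0.1.1", "10.0.1.2:2379")

print(shlex.join(export_cmd))
print(shlex.join(import_cmd))
print(config_text)
```

The script only prints the commands and the config so they can be reviewed before anything is run against either cluster.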

| username: EricSong | Original post link

Thank you, but the solution of upgrading after physical import is not suitable for my situation. The target machine is an RHEL8 system, and the 4.x cluster does not support RHEL8. Therefore, I cannot migrate and then upgrade. I have to upgrade directly from v4.x to v6.x.

| username: dba-kit | Original post link

Solution 2 is amazing, I wonder why I didn’t think of it during my migration :joy:
I used Solution 3 during my migration, which was quite troublesome. Dumpling split its output files based on table statistics, so the files were not actually evenly sized. When importing with TiDB Lightning, a few tables that were clearly very small had been split into many small files, which made the import very slow and required manual handling. (I also had to retry the Dumpling export several times because of parameter issues.)
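
One way to catch the many-small-files problem before starting the import is to scan the export directory and flag tables with a high file count but a small average file size. The sketch below assumes data files are named `<schema>.<table>.<chunk>.sql` and that schema and table names contain no dots. The thresholds are arbitrary illustrative defaults, not recommended values.

```python
import os
import re
from collections import defaultdict

# Assumed data-file naming: <schema>.<table>.<chunk>.sql
DATA_FILE = re.compile(
    r"^(?P<schema>[^.]+)\.(?P<table>[^.]+)\.(?P<chunk>\d+)\.sql$"
)


def summarize_dump(dump_dir):
    """Return {"schema.table": (file_count, total_bytes)} for data files."""
    stats = defaultdict(lambda: [0, 0])
    for name in os.listdir(dump_dir):
        match = DATA_FILE.match(name)
        if match is None:
            continue  # schema files, metadata, and anything unrecognized
        key = f"{match['schema']}.{match['table']}"
        stats[key][0] += 1
        stats[key][1] += os.path.getsize(os.path.join(dump_dir, name))
    return {table: (count, size) for table, (count, size) in stats.items()}


def fragmented_tables(stats, min_files=100, max_avg_bytes=1 << 20):
    """List tables split into many files whose average size is small."""
    return sorted(
        table
        for table, (count, size) in stats.items()
        if count >= min_files and size / count <= max_avg_bytes
    )
```

Calling `fragmented_tables(summarize_dump("/data/dump"))` on a hypothetical export directory returns the tables worth re-exporting with different split settings before the import begins.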

| username: dba-kit | Original post link

Has it been verified that 4.x cannot run on RHEL8? In theory, it should work.

| username: EricSong | Original post link

I haven’t done any testing on my end, but this is what the documentation says, so I used that as the basis.

| username: dba-kit | Original post link

I suggest first checking whether TiDB 4 can be installed successfully on RHEL8, since you would only be using TiDB 4 as a transition. As long as the import succeeds, you can upgrade to TiDB 6 afterwards.

PS: I used the logical export/import method for the migration, and there are still many details to pay attention to in the process, making it difficult to estimate the project duration.

| username: EricSong | Original post link

Thank you, I will also refer to this solution. I will first try to set up a 4.0 cluster on RHEL8 to see if it is feasible.
