Data Migration: Migrating from V4.0.0 to V7.1.1

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 数据迁移,V4.0.0迁移到V7.1.1

| username: 随缘天空

[TiDB Usage Environment] Production Environment / Testing / PoC
[Encountered Issues: Problem Phenomenon and Impact] I need to migrate all databases from a V4.0.0 cluster to a V7.1.1 cluster. What should I watch out for, and what pre-checks should be performed? The two clusters are on different machines.
Current Cluster Situation:
V4.0.0: 3 TiDB nodes, 9 TiKV nodes, 3 PD nodes, 2 TiFlash nodes
V7.1.1: 2 TiDB nodes, 3 TiKV nodes, 3 PD nodes, no TiFlash nodes

| username: Billmay表妹 | Original post link

You can check this document: Migrate Data from One TiDB Cluster to Another TiDB Cluster | PingCAP Documentation Center

| username: Fly-bird | Original post link

Both upgrading and migrating are supported.

| username: tidb菜鸟一只 | Original post link

The only option should be Dumpling + Lightning for the full data, followed by TiCDC or Binlog for incremental synchronization, right?

| username: 随缘天空 | Original post link

BR + TiCDC should be used. What I want to know is what to pay attention to, since the two clusters differ in version and in number of nodes.

| username: 随缘天空 | Original post link

I have read it. What I want to know are the precautions, since the two clusters differ in version and in number of nodes.

| username: 随缘天空 | Original post link

What I am asking about is the precautions for the migration and the checks to run before it.

| username: 啦啦啦啦啦 | Original post link

BR won't work; the version gap is too large.

| username: 大飞哥online | Original post link

If you can afford downtime, just use the logical method for export and import.

| username: zhanggame1 | Original post link

First, check the data volume. If the data volume is not large, Dumpling will suffice and it’s quite fast.
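
One way to check the data volume is to sum the size estimates in `information_schema.tables` on the source cluster. The sketch below only holds the query and summarizes rows you have already fetched with any MySQL-compatible client. The names `SIZE_QUERY` and `summarize_sizes` are made up for illustration, and the sizes reported by that table are estimates, so treat the result as a rough figure.

```python
from collections import defaultdict

GIB = 1024 ** 3

# Query to run on the source (V4.0.0) cluster with any MySQL-compatible client.
# data_length and index_length are estimates, not exact byte counts.
SIZE_QUERY = """
SELECT table_schema, table_name, data_length + index_length AS approx_bytes
FROM information_schema.tables
WHERE LOWER(table_schema) NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'metrics_schema')
"""


def summarize_sizes(rows):
    """Group (schema, table, approx_bytes) rows into per-schema totals in GiB."""
    per_schema = defaultdict(int)
    for schema, _table, approx_bytes in rows:
        # approx_bytes can be NULL (None) for some objects, count it as 0.
        per_schema[schema] += int(approx_bytes or 0)
    totals = {schema: size / GIB for schema, size in per_schema.items()}
    totals["__total__"] = sum(per_schema.values()) / GIB
    return totals


if __name__ == "__main__":
    # Made-up rows standing in for the query result.
    sample = [("app", "orders", 2 * GIB), ("app", "users", GIB), ("log", "events", None)]
    for schema, gib in summarize_sizes(sample).items():
        print(f"{schema}: {gib:.1f} GiB")
```

The total gives a starting point for judging whether a plain Dumpling export is practical in the downtime you have.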

| username: 随缘天空 | Original post link

Dumpling might not work, right? Because both the upstream and downstream databases are TiDB.

| username: 像风一样的男子 | Original post link

Dumpling supports both upstream and downstream being TiDB.

| username: zhanggame1 | Original post link

Then import using Lightning.
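
A rough sketch of the Dumpling export and Lightning import steps, written as Python helpers that only build the command lines. The hosts, port, user, and directories are placeholders, passwords are left out on purpose, and the thread count and chunk sizes are arbitrary starting values rather than recommendations. It assumes the logical import mode (`--backend tidb`); the physical mode needs more options that are not shown here.

```python
import shlex


def dumpling_cmd(host, port, user, out_dir, threads=8):
    """Build a Dumpling command that exports SQL files from the source cluster."""
    return [
        "dumpling",
        "-h", host,
        "-P", str(port),
        "-u", user,
        "--filetype", "sql",
        "-t", str(threads),  # number of export threads
        "-r", "200000",      # split tables into chunks of roughly this many rows
        "-F", "256MiB",      # maximum size of a single output file
        "-o", out_dir,       # directory that receives the exported files
    ]


def lightning_cmd(host, port, user, data_dir, backend="tidb"):
    """Build a TiDB Lightning command that imports the exported files."""
    return [
        "tidb-lightning",
        "--backend", backend,  # "tidb" is the logical import mode
        "-d", data_dir,        # directory produced by Dumpling
        "--tidb-host", host,
        "--tidb-port", str(port),
        "--tidb-user", user,
    ]


if __name__ == "__main__":
    # Placeholder addresses: 10.0.0.1 is the V4.0.0 cluster, 10.0.0.2 the V7.1.1 cluster.
    print(shlex.join(dumpling_cmd("10.0.0.1", 4000, "root", "/data/export")))
    print(shlex.join(lightning_cmd("10.0.0.2", 4000, "root", "/data/export")))
```

Printing the commands first makes it easy to review them before running anything against production.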

| username: tidb菜鸟一只 | Original post link

BR is not very reliable. Upgrading from 4.0 to 7.1 spans too many versions. You can check the compatibility of BR here:

| username: 普罗米修斯 | Original post link

Exporting full data with Dumpling:

  1. Plenty of memory is required to prevent OOM during the export.
  2. dump failed: Error 1105: statement count 5001 exceeds the transaction limitation, autocommit = true. If this error occurs during the export, increase stmt-count-limit.

Importing full data with Lightning:

  1. Plenty of memory is required to prevent OOM during the import.
  2. Error 1071: Specified key was too long; max key length is 3072 bytes. If this error occurs during the import, add max-index-length = 12288 (3072*4) to the tidb-server configuration and reload the new cluster.
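
The two configuration changes mentioned above could be prepared along these lines. This is only a sketch: it assumes the clusters are managed with TiUP, so the output is shaped like a `server_configs.tidb` fragment, and the helper names and the value 50000 are illustrative. The range check reflects the documented limits for `max-index-length` (3072 to 3072*4).

```python
# Documented valid range for max-index-length, in bytes.
MIN_INDEX_LEN = 3072
MAX_INDEX_LEN = 3072 * 4


def tidb_overrides(stmt_count_limit=None, max_index_length=None):
    """Return tidb-server overrides as a flat {dotted-key: value} dict."""
    overrides = {}
    if stmt_count_limit is not None:
        if stmt_count_limit <= 0:
            raise ValueError("stmt-count-limit must be positive")
        overrides["performance.stmt-count-limit"] = stmt_count_limit
    if max_index_length is not None:
        if not MIN_INDEX_LEN <= max_index_length <= MAX_INDEX_LEN:
            raise ValueError("max-index-length must be between 3072 and 12288")
        overrides["max-index-length"] = max_index_length
    return overrides


def render_server_configs(overrides):
    """Render the overrides as a YAML fragment for a TiUP topology file."""
    lines = ["server_configs:", "  tidb:"]
    for key in sorted(overrides):
        lines.append(f"    {key}: {overrides[key]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # 50000 is an arbitrary example value, not a recommendation.
    print(render_server_configs(
        tidb_overrides(stmt_count_limit=50000, max_index_length=3072 * 4)
    ))
```

With TiUP, a fragment like this would typically be applied through `tiup cluster edit-config` and then a reload of the TiDB nodes.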

Incremental synchronization with Binlog:

  1. When Drainer starts, it traverses all DDL operations in the binlog history in memory, which consumes a large amount of memory. In testing, Drainer on a 6.4 TB cluster needed 230 GB of memory to start and took 1 hour to finish the traversal.

| username: 大飞哥online | Original post link

If you are moving across major versions, logical backups are the better approach.