Error When Importing a Dumpling Export from 5.4 into 7.5.1 with Lightning

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 从5.4dumpling导出数据lightning到7.5.1报错

| username: 像风一样的男子

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version]
[Reproduction Path]
I exported a full dump from a 5.4.3 cluster with Dumpling, then hit the following error when importing it into a 7.5.1 cluster with tidb-lightning in local mode:
tidb lightning encountered error: [Lighting:Restore:ErrChecksumMismatch] checksum mismatched remote vs local => (checksum: 16570772018143892155 vs 2971348579763155136) (total_kvs: 28528 vs 28530) (total_bytes: 2751310 vs 2751536)
A logical import using tidb mode completes without issues.
Is this kind of cross-version migration a problem?
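
As a rough aid for reading the error message in the post above, the sketch below pulls the remote and local counters out of the text and subtracts them. The regular expression and the function name `parse_mismatch` are illustrative assumptions, not part of TiDB Lightning.

```python
import re

# Error text copied from the post above.
ERROR = (
    "checksum mismatched remote vs local => "
    "(checksum: 16570772018143892155 vs 2971348579763155136) "
    "(total_kvs: 28528 vs 28530) (total_bytes: 2751310 vs 2751536)"
)


def parse_mismatch(message):
    """Return {field: (remote, local)} for each '(name: a vs b)' group."""
    pairs = re.findall(r"\((\w+): (\d+) vs (\d+)\)", message)
    return {name: (int(remote), int(local)) for name, remote, local in pairs}


fields = parse_mismatch(ERROR)
remote_kvs, local_kvs = fields["total_kvs"]
remote_bytes, local_bytes = fields["total_bytes"]

# Local counts exceed remote counts when some encoded pairs did not survive.
missing_kvs = local_kvs - remote_kvs
missing_bytes = local_bytes - remote_bytes
print(f"KV pairs encoded locally but absent from the cluster: {missing_kvs}")
print(f"Bytes accounted for by those pairs: {missing_bytes}")
```

Here the local side counted 2 more KV pairs (226 bytes) than the cluster reports. One possible reading, offered as an assumption rather than a diagnosis, is that a small number of key-value pairs shared a key with another pair and were overwritten on write.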

| username: hey-hoho | Original post link

The checksum comparison between the upstream and downstream data failed. Was the target table in the 7.5.1 cluster empty before the import?

| username: MrSylar | Original post link

Have you verified the version of tidb-lightning?

| username: 林夕一指 | Original post link

Could changes to the system tables between versions be the cause? Did you exclude the system databases when exporting the full data?

| username: 像风一样的男子 | Original post link

Yes, it’s a new cluster.

| username: 像风一样的男子 | Original post link

I downloaded the Lightning version that matches the target cluster: https://download.pingcap.org/tidb-community-toolkit-v7.5.1-linux-amd64.tar.gz

| username: 像风一样的男子 | Original post link

Dumpling ignores system tables during export.

| username: yytest | Original post link

Ensure that the versions of Dumpling and TiDB Lightning you are using are compatible with the TiDB versions of both the source and target clusters. Generally, it is best to use the same version of Dumpling and TiDB Lightning as the target cluster.

Before importing, check whether the exported data files are complete and intact.

Try using the logical import mode (i.e., TiDB Lightning’s tidb mode) to import the data and see if it succeeds. Logical import goes through the TiDB layer, so it may be more compatible with differences between versions.

If possible, try importing the data in batches to determine which data is causing the checksum mismatch.

| username: 小龙虾爱大龙虾 | Original post link

Generally, cross-version migration is not the issue. First, make sure that the downstream table is empty and that no other sessions modify it during the import; otherwise, a checksum error may occur. Another common cause is data conflicts. When conflict detection is set to none, Lightning relies on the checksum alone to verify that the data was imported correctly. Data conflicts can arise for various reasons, such as different sql_modes, and need to be investigated case by case. For reference, see: Use Physical Import Mode | PingCAP Docs
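
One way to look for the data conflicts mentioned above is to scan the exported files for repeated key values before importing. The sketch below is a minimal illustration, assuming the data was exported as CSV with a header row. The function `find_duplicate_keys`, the `normalize` hook, and the sample rows are all hypothetical. The hook stands in for whatever comparison rule the target table applies to its keys (for example, a case-insensitive collation, which is an assumption and not something reported in this thread).

```python
import csv
import io
from collections import defaultdict


def find_duplicate_keys(csv_text, key_columns, normalize=str):
    """Map each repeated key to the 1-based data-row numbers where it appears."""
    seen = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        key = tuple(normalize(row[column]) for column in key_columns)
        seen[key].append(row_number)
    return {key: rows for key, rows in seen.items() if len(rows) > 1}


# Hypothetical sample data, not taken from the thread.
SAMPLE = "id,email\n1,a@example.com\n2,A@example.com\n3,b@example.com\n"

# Byte-for-byte comparison: the two addresses differ by case, so no duplicates.
exact = find_duplicate_keys(SAMPLE, ["email"])

# Case-folded comparison: rows 1 and 2 now collide on the same key.
folded = find_duplicate_keys(SAMPLE, ["email"], normalize=str.casefold)

print(exact)
print(folded)
```

Keys that only collide after normalization are the kind of conflict that an exact comparison of the source data would miss, which is why the comparison rule is left as a parameter here.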

| username: 像风一样的男子 | Original post link

I investigated and found that it is a data duplication issue after all. I will open a new thread for it; please help me take a look there.