Data Inconsistency Between Upstream MySQL and Downstream TiDB?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 上游mysql和下游tidb数据不一致?

| username: 孤君888

【TiDB Usage Environment】Production
【TiDB Version】v6.1.0
【Encountered Problem】

Background: I use the DM component to make TiDB a real-time replication slave, with MySQL 5.7 as the upstream.

Problem: Now, through sync_diff_inspector (data verification based on the DM synchronization scenario), I found data inconsistencies. However, I can’t see any issues upon my own inspection.

The sync_diff_inspector configuration is as follows:

check-thread-count = 4
export-fix-sql = true
check-struct-only = false
dm-addr = "http://x.x.x.x:8261"
dm-task = "task_mysql1322033307_incremental"

[task]
    output-dir = "./output"
    target-check-tables = ["xxx.*"]

The sync_diff_inspector run result summary.txt is as follows:

......
......
......
The following tables contain inconsistent data

+---------------------------------+--------------------+----------------+---------+-----------+
|              TABLE              | STRUCTURE EQUALITY | DATA DIFF ROWS | UPCOUNT | DOWNCOUNT |
+---------------------------------+--------------------+----------------+---------+-----------+
| `xxxxx`.`xxxx` | true               | +6913/-6913    |   57885 |     57885 |
+---------------------------------+--------------------+----------------+---------+-----------+

Time Cost: 19.60125373s

【Reproduction Path】What operations were performed to cause the problem
【Problem Phenomenon and Impact】

| username: Ming | Original post link

There are some limitations on data validation for JSON, FLOAT, and DOUBLE.

| username: Ming | Original post link

| username: 孤君888 | Original post link

I just checked the structure of my upstream MySQL table and didn’t find anything that would trigger the usage limitations of sync-diff-inspector.

The structure of the upstream MySQL table is as follows:

| username: 小龙虾爱大龙虾 | Original post link

The export-fix-sql option has been configured as true in the sync_diff configuration file, so the corresponding fix-sql should be output. You can check the fix-sql to locate the specific row’s primary key and manually check whether the data is consistent between the upstream and downstream.

| username: 孤君888 | Original post link

I manually checked it, and it looks consistent.

| username: xiaohetao | Original post link

Apart from the limitations of data verification, check if there is any latency. If there is latency, it can also cause data inconsistency.

| username: zhouzeru | Original post link

Usage limitations of sync-diff-inspector

| username: 孤君888 | Original post link

Currently, I found that the default character set rule for TiDB is utf8mb4_bin, while for upstream MySQL it is utf8mb4_general_ci. I think this might be the reason.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.