I found that TIDB DM cannot detect master-slave data inconsistency

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 我发现TIDB DM无法检测主从数据不一致

| username: Hacker_7b2KWuuo

I have a table t1 with 5 rows of data on my MySQL primary database. I intentionally deleted these 5 rows on TiDB, then went back to the upstream MySQL and executed an update on the entire table. When I returned to TiDB and executed query-status, there was no error reported. If it were MySQL, there should be a 1032 error, and synchronization would stop. However, TiDB DM did not alert the DBA to any errors, making it impossible to detect data inconsistency.


| username: TiDBer_小阿飞 | Original post link

Sure, please provide the Chinese text you need translated.

| username: Hacker_7b2KWuuo | Original post link

When will this BUG be fixed? If data inconsistency cannot be detected, TiDB cannot be used as a replica for business purposes, especially in financial services. A single data error could involve tens of thousands of dollars, and if the product calculates the data incorrectly, it would directly result in a P0 incident for the DBA.

| username: 有猫万事足 | Original post link

Check if this helps you, I haven’t tried it, so I’m not very sure.

| username: onlyacat | Original post link

  1. The downstream TiDB target tables should not be modified; they are read-only.
  2. This issue is not related to DM. DM is only responsible for parsing the upstream binlog and replaying it downstream. For example, if you insert instead of delete downstream, should it report an error?
  3. If you need to detect data inconsistencies, TiDB provides a tool called sync-diff-inspector [sync-diff-inspector User Guide | PingCAP Docs].
| username: Hacker_7b2KWuuo | Original post link

Thanks, I’ll test it this afternoon. That’s exactly what I meant.

| username: 有猫万事足 | Original post link

I agree with @onlyacat.
Detecting and then fixing the issue is a rather passive way to address this problem.
It might be feasible, but the cost would definitely be very high. The resources required to deal with occasional unexpected writes would be several times greater than the problem itself.

The best approach is still to revoke write/DDL permissions on the sync table from all users except dmuser.
Preventing unexpected writes/structural changes to the downstream table is the best solution.

| username: Hacker_7b2KWuuo | Original post link

I have already tested it, and it doesn’t work. The mechanism is different from MySQL, and it cannot report error 1032.

| username: Hacker_7b2KWuuo | Original post link

This is a feature that needs to have the exact same mechanism as MySQL. If there is data inconsistency between the master and slave, it must trigger an alert to inform the DBA. You can look into MySQL error 1032. I hope the official team can resolve this bug as soon as possible.

| username: Hacker_7b2KWuuo | Original post link

No, you should take a look at the MySQL master-slave synchronization mechanism. DM should be similar. No matter what method is used, if the data is inconsistent, an alert must be triggered. This is the most basic function. If you can’t trigger an alert, the subsequent data will be inconsistent, and the resulting failure will be a P0 issue!!!

| username: TiDBer_5cwU0ltE | Original post link

I feel like this is just the design philosophy of the DM tool.

| username: onlyacat | Original post link

TiDB is not MySQL, and DM is just a peripheral tool. This is not a basic function. You should first understand the principles of MySQL master-slave replication. If you have this requirement, you can directly provide feedback to the official team, although I feel they probably won’t implement it.

| username: 这里介绍不了我 | Original post link

Try submitting the request to the official team.

| username: Hacker_7b2KWuuo | Original post link

First, you should understand MySQL error 1032.

| username: Hacker_7b2KWuuo | Original post link

Do you have an email? I want to send them a request.

| username: Hacker_7b2KWuuo | Original post link

My requirement is: to have the same synchronization replication as MySQL. If errors 1062 and 1032 occur, the synchronization should be interrupted, and DM should provide an error message. The DBA can monitor and resolve it through a script. If the official team can see this post, I hope this feature can be added.

| username: 有猫万事足 | Original post link

You can submit your request here.

| username: CuteRay | Original post link

How should I put it, is this considered a bug? I don’t think so. The documentation clearly defines DM as a convenient data migration tool. Can it do data synchronization? Yes, it can, but it is primarily for migrating MySQL to TiDB. In other words, eventually, you will switch to TiDB. If before the switch, you insist on using TiDB as the downstream, and both MySQL and TiDB are open for business to do dual writes, then I think you shouldn’t use DM, or rather, your architectural choice was wrong from the beginning. The downstream backup is just a backup, and at the moment it serves as a backup, it should be read-only. If one day you write to the backup, delete data, and then blame the tool for having flaws? That’s interesting.

| username: 路在何chu | Original post link

I think so too. It’s just for temporary synchronization during migration. It’s not meant for long-term synchronization. If you’re not going to use TiDB, then there’s no need to migrate. If you plan to use TiDB, you should consider switching as soon as the migration is complete. It can only ensure data consistency during the migration process, but it cannot guarantee consistency during the synchronization period.

| username: Hacker_7b2KWuuo | Original post link

I am using TiDB as a replica to provide complex SQL queries for business purposes, without writing data. I am just manually testing whether DM’s replication mechanism is the same as MySQL’s. When MySQL encounters an error during synchronous replication, it will automatically stop the sql_thread thread. Executing show slave status will show the specific error information, and after the DBA fixes the data, the DBA will execute start slave to resume synchronization. I hope DM can have the same replication mechanism as MySQL.