Configured multiple database names in mysql binlog-do-db, unable to achieve incremental synchronization using DM

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: mysql binlog-do-db配置了多个库名, 使用dm无法实现增量同步

| username: TiDBer_rYOSh9JN

Bug Report
Clearly and accurately describe the issue you found. Providing any steps to reproduce the problem can help the development team address the issue promptly.
[TiDB Version]
V6.5.0
[Impact of the Bug]
DM cannot perform incremental synchronization

[Possible Steps to Reproduce the Issue]

  1. Source MySQL database configuration: binlog-do-db=db1,db_to_sync
  2. DM only synchronizes the db_to_sync database
  3. DM can only perform full synchronization of the db_to_sync database, cannot perform incremental synchronization, and does not report an error
  4. Source MySQL database configuration: binlog-do-db=db_to_sync
  5. Incremental synchronization works at this point

[Observed Unexpected Behavior]
Cannot perform incremental synchronization, and no errors are reported in the logs

[Expected Behavior]
When mysql.ini is configured with binlog-do-db=db1,db_to_sync, and there are other databases before the one that needs to be synchronized, it should still be able to perform incremental synchronization

[Related Components and Specific Versions]
V6.5.0

[Other Background Information or Screenshots]
Such as cluster topology, system and kernel version, application app information, etc.; if the issue is related to SQL, please provide SQL statements and related table schema information; if there are critical errors in the node logs, please provide the relevant node log content or files; if some business-sensitive information is inconvenient to provide, please leave contact information, and we will communicate with you privately.

| username: 有猫万事足 | Original post link

How is your task.yaml written?

| username: xfworld | Original post link

Are the MySQL binlog configuration and account permissions sufficient?

| username: xingzhenxiang | Original post link

Post the configuration file and let’s take a look.

| username: TiDBer_rYOSh9JN | Original post link

vi source1.yaml
source-id: “mysql-207”
enable-gtid: false
from:
host: “10.0.0.207”
port: 3306
user: “root”
password: “xxx”

vi source2.yaml
source-id: “mysql-208”
enable-gtid: false
from:
host: “10.0.0.208”
port: 3306
user: “root”
password: “xxx”

vi task.yaml

# Merge MySQL tables from 2 separate databases (with the same database and table names) into TiDB
name: "mysql_merge_to_center"
task-mode: all
shard-mode: "pessimistic"
meta-schema: "dm_meta"
ignore-checking-items: ["auto_increment_ID"]

target-database:
  host: 10.x.x.x
  port: 4000
  user: "root"
  password: "xxxxx"

mysql-instances:
  -
    source-id: "mysql-207"
    block-allow-list:  "listA"
  -
    source-id: "mysql-208"
    block-allow-list:  "listA"

block-allow-list:
  listA:
    do-dbs: ["db_to_sync"]
    do-tables:
    - db-name: "db_to_sync"
      tbl-name: "table1"
| username: 有猫万事足 | Original post link

If you do not need to route the source table to a target TiDB table with a different name, you can skip this configuration. This rule must be configured for scenarios involving sharding and merging migrations.

You are in a sharding and merging migration scenario, but you have not configured the routing rules at all. This will inevitably lead to data from the two MySQL instances 207 and 208 overwriting each other, and you will need to modify your downstream tables; otherwise, there is a high probability of primary key conflicts between the two MySQL instances. With safe-mode enabled by default during import, the data will overwrite each other.

Therefore, you will certainly feel that sometimes it works and sometimes it doesn’t, which has nothing to do with your upstream binlog-do-db.

Synchronize the table1 in the db_to_sync database upstream to the same-named database and table downstream.

routes:                          
  route-rule-1:                  
    schema-pattern: "db_to_sync"      
    table-pattern: "table1"         
    target-schema: "db_to_sync"         
    target-table: "table1"             
  route-rule-2:
    schema-pattern: "db_to_sync"
    target-schema: "db_to_sync"

mysql-instances:
  - source-id: "mysql-207"                   
    block-allow-list:  "listA"                             
    route-rules: ["route-rule-1","route-rule-2"]
  - source-id: "mysql-208"                   
    block-allow-list:  "listA"                             
    route-rules: ["route-rule-1","route-rule-2"]

After configuring this, be sure to add a sharding key to the primary key of your downstream table. Also, pay close attention to the write hotspot issue in the downstream table.
Otherwise, even if the configuration is correct, primary key conflicts and data overwrites will still result in incorrect outcomes. Write hotspots will also slow down the import process and cause an unbalanced TiKV load.

| username: TiDBer_rYOSh9JN | Original post link

It’s not that it works sometimes and doesn’t work other times; it has never worked. The primary key of the source table has been handled to ensure there are no conflicts. The reasons for not doing routing are: 1. The primary key of the source table will not conflict (workid); 2. All database names and table names are completely consistent.

| username: 有猫万事足 | Original post link

Execute these two commands on the control machine, and let me see the results. If the DM_MASTER_ADDR environment variable is not set, you may need to add --master-addr dm-master-ip:8261.

| username: redgame | Original post link

Check the DM log files to see if there are any related errors or warning messages.