Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: mysql binlog-do-db配置了多个库名, 使用dm无法实现增量同步
Bug Report
Clearly and accurately describe the issue you found. Providing any steps to reproduce the problem can help the development team address the issue promptly.
[TiDB Version]
V6.5.0
[Impact of the Bug]
DM cannot perform incremental synchronization
[Possible Steps to Reproduce the Issue]
- Source MySQL database configuration: binlog-do-db=db1,db_to_sync
- DM only synchronizes the db_to_sync database
- DM can only perform full synchronization of the db_to_sync database, cannot perform incremental synchronization, and does not report an error
- Source MySQL database configuration: binlog-do-db=db_to_sync
- Incremental synchronization works at this point
[Observed Unexpected Behavior]
Cannot perform incremental synchronization, and no errors are reported in the logs
[Expected Behavior]
When mysql.ini is configured with binlog-do-db=db1,db_to_sync, and there are other databases before the one that needs to be synchronized, it should still be able to perform incremental synchronization
[Related Components and Specific Versions]
V6.5.0
[Other Background Information or Screenshots]
Such as cluster topology, system and kernel version, application app information, etc.; if the issue is related to SQL, please provide SQL statements and related table schema information; if there are critical errors in the node logs, please provide the relevant node log content or files; if some business-sensitive information is inconvenient to provide, please leave contact information, and we will communicate with you privately.
How is your task.yaml written?
Are the MySQL binlog configuration and account permissions sufficient?
Post the configuration file and let’s take a look.
vi source1.yaml
source-id: “mysql-207”
enable-gtid: false
from:
host: “10.0.0.207”
port: 3306
user: “root”
password: “xxx”
vi source2.yaml
source-id: “mysql-208”
enable-gtid: false
from:
host: “10.0.0.208”
port: 3306
user: “root”
password: “xxx”
vi task.yaml
# Merge MySQL tables from 2 separate databases (with the same database and table names) into TiDB
name: "mysql_merge_to_center"
task-mode: all
shard-mode: "pessimistic"
meta-schema: "dm_meta"
ignore-checking-items: ["auto_increment_ID"]
target-database:
host: 10.x.x.x
port: 4000
user: "root"
password: "xxxxx"
mysql-instances:
-
source-id: "mysql-207"
block-allow-list: "listA"
-
source-id: "mysql-208"
block-allow-list: "listA"
block-allow-list:
listA:
do-dbs: ["db_to_sync"]
do-tables:
- db-name: "db_to_sync"
tbl-name: "table1"
If you do not need to route the source table to a target TiDB table with a different name, you can skip this configuration. This rule must be configured for scenarios involving sharding and merging migrations.
You are in a sharding and merging migration scenario, but you have not configured the routing rules at all. This will inevitably lead to data from the two MySQL instances 207 and 208 overwriting each other, and you will need to modify your downstream tables; otherwise, there is a high probability of primary key conflicts between the two MySQL instances. With safe-mode enabled by default during import, the data will overwrite each other.
Therefore, you will certainly feel that sometimes it works and sometimes it doesn’t, which has nothing to do with your upstream binlog-do-db.
Synchronize the table1 in the db_to_sync database upstream to the same-named database and table downstream.
routes:
route-rule-1:
schema-pattern: "db_to_sync"
table-pattern: "table1"
target-schema: "db_to_sync"
target-table: "table1"
route-rule-2:
schema-pattern: "db_to_sync"
target-schema: "db_to_sync"
mysql-instances:
- source-id: "mysql-207"
block-allow-list: "listA"
route-rules: ["route-rule-1","route-rule-2"]
- source-id: "mysql-208"
block-allow-list: "listA"
route-rules: ["route-rule-1","route-rule-2"]
After configuring this, be sure to add a sharding key to the primary key of your downstream table. Also, pay close attention to the write hotspot issue in the downstream table.
Otherwise, even if the configuration is correct, primary key conflicts and data overwrites will still result in incorrect outcomes. Write hotspots will also slow down the import process and cause an unbalanced TiKV load.
It’s not that it works sometimes and doesn’t work other times; it has never worked. The primary key of the source table has been handled to ensure there are no conflicts. The reasons for not doing routing are: 1. The primary key of the source table will not conflict (workid); 2. All database names and table names are completely consistent.
Execute these two commands on the control machine, and let me see the results. If the DM_MASTER_ADDR environment variable is not set, you may need to add --master-addr dm-master-ip:8261.
Check the DM log files to see if there are any related errors or warning messages.