Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: dm-worker pauses replication when syncing a DDL from upstream MySQL, but query-status shows a normal result

[TiDB Usage Environment] Production Environment
[TiDB Version]
TiDB version: 4.0.11
DM version: 2.0.1
[Encountered Problem: Problem Phenomenon and Impact]
The upstream MySQL ran an ALTER TABLE xxxx ENGINE = innodb operation (via the pt tool) to reclaim table space.
The dm-worker hung while replicating this statement, but query-status still reported a normal state, so no alarm was triggered.
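The missed alarm came from checking only the top-level result of query-status. A minimal sketch of a stricter check that inspects each subtask, assuming the JSON layout that DM 2.0's query-status returned in our environment (the sources / subTaskStatus / stage / result.errors field names are taken from that output and may differ in other versions):

```python
import json

def find_unhealthy_subtasks(status_json: str) -> list:
    """Return (source, task, stage) for every subtask that is not Running
    or that carries errors, even when the top-level "result" is true."""
    status = json.loads(status_json)
    unhealthy = []
    for source in status.get("sources", []):
        source_id = source.get("sourceStatus", {}).get("source", "")
        for sub in source.get("subTaskStatus", []):
            stage = sub.get("stage", "")
            errors = (sub.get("result") or {}).get("errors") or []
            if stage != "Running" or errors:
                unhealthy.append((source_id, sub.get("name", ""), stage))
    return unhealthy

# Example: a paused subtask that a top-level "result": true would hide.
sample = json.dumps({
    "result": True,
    "sources": [{
        "sourceStatus": {"source": "mysql-01"},
        "subTaskStatus": [{
            "name": "online_internal",
            "stage": "Paused",
            "result": {"errors": [{"ErrCode": 44006}]},
        }],
    }],
})
print(find_unhealthy_subtasks(sample))  # [('mysql-01', 'online_internal', 'Paused')]
```

Alerting on any non-Running stage (or a non-empty error list) would have caught this pause even while the overall query-status result stayed "true".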
The dm-worker.log reported the following error:
[2023/12/06 10:35:14.295 +08:00] [ERROR] [subtask.go:310] ["unit process error"] [subtask=online_internal] [unit=Sync] ["error information"="{"ErrCode":44006,"ErrClass":"schema-tracker","ErrScope":"internal","ErrLevel":"high","Message":"startLocation: [position: (mysql-bin.002974, 927069247), gtid-set: ], endLocation: [position: (mysql-bin.002974, 927069510), gtid-set: ]: cannot track DDL: ALTER TABLE `xxxx_db`.`xxxx_table` ENGINE = innodb","RawCause":"[ddl:8200]This type of ALTER TABLE is currently unsupported"}"]
query-status result (screenshot):
Subsequent recovery:
The task was restarted with stop-task and start-task, and it recovered automatically after more than 5 minutes.
Recovery log as follows:
[2023/12/06 10:35:14.295 +08:00] [ERROR] [subtask.go:310] ["unit process error"] [subtask=online_internal] [unit=Sync] ["error information"="{"ErrCode":44006,"ErrClass":"schema-tracker","ErrScope":"internal","ErrLevel":"high","Message":"startLocation: [position: (mysql-bin.002974, 927069247), gtid-set: ], endLocation: [position: (mysql-bin.002974, 927069510), gtid-set: ]: cannot track DDL: ALTER TABLE `xxxx_db`.`xxxx_table` ENGINE = innodb","RawCause":"[ddl:8200]This type of ALTER TABLE is currently unsupported"}"]
[2023/12/06 10:35:17.913 +08:00] [INFO] [worker.go:270] ["auto_resume sub task"] [component="worker controller"] [task=online_internal]
[2023/12/06 10:35:17.914 +08:00] [INFO] [subtask.go:497] ["resume with unit"] [subtask=online_internal] [unit=Sync]
[2023/12/06 10:35:17.914 +08:00] [INFO] [task_checker.go:400] ["dispatch auto resume task"] [component="task checker"] [task=online_internal]
[2023/12/06 10:35:18.326 +08:00] [INFO] [syncer.go:1132] ["replicate binlog from checkpoint"] [task=online_internal] [unit="binlog replication"] [checkpoint="position: (mysql-bin.002974, 927069087), gtid-set: "]
[2023/12/06 10:35:18.328 +08:00] [INFO] [streamer_controller.go:71] ["last slave connection"] [task=online_internal] [unit="binlog replication"] ["connection ID"=1671785]
[2023/12/06 10:35:18.328 +08:00] [INFO] [mode.go:100] ["change count"] [task=online_internal] [unit="binlog replication"] ["previous count"=0] ["new count"=0]
[2023/12/06 10:35:18.328 +08:00] [INFO] [mode.go:100] ["change count"] [task=online_internal] [unit="binlog replication"] ["previous count"=0] ["new count"=1]
During the synchronization interruption, dm-worker_stdout.log showed the following (screenshot; the red box marks the logs after recovery).
Question:
Is there any optimization or solution for this situation in subsequent versions?
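For reference, one way to avoid hitting this would presumably be to stop such DDL from reaching TiDB at all, using DM's binlog event filter in the task configuration. A sketch only, assuming DM 2.0's filter syntax; the rule name and patterns below are illustrative, not taken from our actual config:

```yaml
filters:
  ignore-alter-engine:                # illustrative rule name
    schema-pattern: "xxxx_db"
    table-pattern: "xxxx_table"
    sql-pattern: ["ALTER\\s+TABLE\\s+.*ENGINE\\s*="]   # match the space-reclaim DDL
    action: Ignore
```

The rule would also need to be referenced from the corresponding mysql-instances entry via filter-rules. Whether this is the recommended approach, or whether later versions handle this DDL gracefully, is exactly what I am asking.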