Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: 6.5版本在DM同步DDL时候被卡死,没办法取消

DM synchronized a small table’s DDL to the TiDB server, and the TiDB server was stuck. Using ADMIN CANCEL DDL JOBS
to cancel the DDL also kept failing. After nearly four hours, it finally succeeded in canceling. During this period, there were many [2023/05/24 19:09:28.673 +08:00] [INFO] [syncer.go:333] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip {tidb-server-ip} port 4000, id 831b35bd-9d31-4ec4-a005-08ac7c6ead64"] ["ddl id"=47401] [ver=33590]
errors.
Filtered logs are as follows:
[2023/05/24 19:09:28.569 +08:00] [INFO] [ddl_worker.go:314] ["[ddl] add DDL jobs"] ["batch count"=1] [jobs="ID:47401, Type:add column, State:queueing, SchemaState:none, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:4, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0; "] [table=true]
[2023/05/24 19:09:28.569 +08:00] [INFO] [ddl.go:1011] ["[ddl] start DDL job"] [job="ID:47401, Type:add column, State:queueing, SchemaState:none, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:4, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [query="ALTER TABLE `db1`.`tb1` ADD COLUMN `col1` VARCHAR(30) NULL COMMENT '收益止盈类型, FIXED:固定值, RANGE:范围值'"]
[2023/05/24 19:09:28.589 +08:00] [INFO] [ddl_worker.go:1168] ["[ddl] run DDL job"] [worker="worker 3, tp general"] [job="ID:47401, Type:add column, State:queueing, SchemaState:none, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:0, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/05/24 19:09:28.592 +08:00] [INFO] [column.go:133] ["[ddl] run add column job"] [job="ID:47401, Type:add column, State:running, SchemaState:none, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:4, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"] [columnInfo="{\"id\":45,\"name\":{\"O\":\"col1\",\"L\":\"col1\"},\"offset\":44,\"origin_default\":null,\"origin_default_bit\":null,\"default\":null,\"default_bit\":null,\"default_is_expr\":false,\"generated_expr_string\":\"\",\"generated_stored\":false,\"dependences\":null,\"type\":{},\"state\":0,\"comment\":\"收益止盈类型, FIXED:固定值, RANGE:范围值\",\"hidden\":false,\"change_state_info\":null,\"version\":2}"]
[2023/05/24 19:09:28.634 +08:00] [INFO] [domain.go:726] ["mdl gets lock, update to owner"] [jobID=47401] [version=33590]
[2023/05/24 19:09:28.673 +08:00] [INFO] [syncer.go:333] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.1.221, port 4000, id 65867468-6375-4732-b2e7-1501df354479"] ["ddl id"=47401] [ver=33590]
[2023/05/24 19:09:28.673 +08:00] [INFO] [syncer.go:333] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 192.168.1.245, port 4000, id 831b35bd-9d31-4ec4-a005-08ac7c6ead64"] ["ddl id"=47401] [ver=33590]
# Many logs filtered out, mostly `syncer check all versions, someone is not synced` errors
[2023/05/25 01:14:22.975 +08:00] [INFO] [ddl_worker.go:1377] ["[ddl] wait latest schema version changed(get the metadata lock if tidb_enable_metadata_lock is true)"] [ver=33590] ["take time"=6h4m54.355850116s] [job="ID:47401, Type:add column, State:running, SchemaState:delete only, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:4, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/05/25 01:14:22.989 +08:00] [INFO] [ddl_worker.go:1168] ["[ddl] run DDL job"] [worker="worker 3, tp general"] [job="ID:47401, Type:add column, State:cancelling, SchemaState:delete only, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:0, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0"]
[2023/05/25 01:14:22.992 +08:00] [INFO] [rollingback.go:433] ["[ddl] the DDL job is cancelled normally"] [worker="worker 3, tp general"] [job="ID:47401, Type:add column, State:rollingback, SchemaState:delete only, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:1, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:[ddl:8214]Cancelled DDL job, ErrCount:1, SnapshotVersion:0"] [error="[ddl:8214]Cancelled DDL job"]
[2023/05/25 01:14:23.004 +08:00] [INFO] [ddl_worker.go:1168] ["[ddl] run DDL job"] [worker="worker 3, tp general"] [job="ID:47401, Type:add column, State:rollingback, SchemaState:delete only, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:0, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:[ddl:8214]Cancelled DDL job, ErrCount:1, SnapshotVersion:0"]
[2023/05/25 01:14:23.084 +08:00] [INFO] [domain.go:726] ["mdl gets lock, update to owner"] [jobID=47401] [version=33626]
[2023/05/25 01:15:21.269 +08:00] [INFO] [ddl_worker.go:1377] ["[ddl] wait latest schema version changed(get the metadata lock if tidb_enable_metadata_lock is true)"] [ver=33626] ["take time"=58.239228943s] [job="ID:47401, Type:add column, State:rollingback, SchemaState:delete reorganization, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:1, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:[ddl:8214]Cancelled DDL job, ErrCount:1, SnapshotVersion:0"]
[2023/05/25 01:15:21.280 +08:00] [INFO] [ddl_worker.go:1168] ["[ddl] run DDL job"] [worker="worker 3, tp general"] [job="ID:47401, Type:add column, State:rollingback, SchemaState:delete reorganization, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:0, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:[ddl:8214]Cancelled DDL job, ErrCount:1, SnapshotVersion:0"]
[2023/05/25 01:15:21.334 +08:00] [INFO] [domain.go:726] ["mdl gets lock, update to owner"] [jobID=47401] [version=33627]
[2023/05/25 01:15:21.379 +08:00] [INFO] [ddl_worker.go:1377] ["[ddl] wait latest schema version changed(get the metadata lock if tidb_enable_metadata_lock is true)"] [ver=33627] ["take time"=74.139844ms] [job="ID:47401, Type:add column, State:rollback done, SchemaState:none, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:1, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:[ddl:8214]Cancelled DDL job, ErrCount:1, SnapshotVersion:0"]
[2023/05/25 01:15:21.393 +08:00] [INFO] [ddl_worker.go:643] ["[ddl] finish DDL job"] [worker="worker 3, tp general"] [job="ID:47401, Type:add column, State:rollback done, SchemaState:none, SchemaID:5747, TableID:7466, RowCount:0, ArgLen:0, start time: 2023-05-24 19:09:28.503 +0800 CST, Err:[ddl:8214]Cancelled DDL job, ErrCount:1, SnapshotVersion:0"]
[2023/05/25 01:15:21.440 +08:00] [INFO] [ddl.go:1114] ["[ddl] DDL job is failed"] [jobID=47401]