DDL Stuck in State Done for a Long Time

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: DDL 卡在 state done 状态很久

| username: 小老板努力变强

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.3
[Reproduction Path]
[Encountered Problem: Problem Phenomenon and Impact]
A DDL job has been stuck in the done state for a long time and is blocking other DDL statements from executing.
[Attachment: Screenshot/Log/Monitoring]


The log of the TiDB DDL owner showed a problem synchronizing the schema change to the other TiDB nodes, and those other nodes were logging:
[2023/12/29 18:46:59.488 +08:00] [INFO] [session.go:4159] ["old running transaction block DDL"] ["table ID"=12673719] [jobID=12680983] ["connection ID"=5606557124618123315] ["elapsed time"=9h34m42.996504788s]
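
For reference, the bracketed key=value fields in that line can be pulled apart with a short script, which helps when many such lines need to be compared. This is only a sketch in Python: the regular expression and the `parse_fields` helper are illustrative and are not part of TiDB.

```python
import re

# Log line copied from the post, with quotes normalized to plain ASCII.
LOG_LINE = (
    '[2023/12/29 18:46:59.488 +08:00] [INFO] [session.go:4159] '
    '["old running transaction block DDL"] ["table ID"=12673719] '
    '[jobID=12680983] ["connection ID"=5606557124618123315] '
    '["elapsed time"=9h34m42.996504788s]'
)

# Matches [key=value] and ["key with spaces"=value]; other brackets are skipped.
FIELD_RE = re.compile(r'\[(?:"([^"]+)"|([A-Za-z_][\w ]*))=([^\]]+)\]')


def parse_fields(line):
    """Return the key=value fields of one log line as a dict of strings."""
    fields = {}
    for quoted_key, bare_key, value in FIELD_RE.findall(line):
        fields[quoted_key or bare_key] = value
    return fields


fields = parse_fields(LOG_LINE)
for key, value in fields.items():
    print(key, "=", value)
```

The three numeric values (table ID, job ID, connection ID) are the ones to carry into any follow-up queries.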

I tried restarting the owner node, but it didn’t help. Checking the processlist didn’t reveal any long-running transactions either.

How can I find the session information that corresponds to this connection ID?
-------2023-12-29 19:19--------
Update: restarting all TiDB nodes restored normal operation.

| username: tidb狂热爱好者 | Original post link

This bug can be resolved by upgrading to version 7.5.

| username: 春风十里 | Original post link

Is there any specific information about the bug? Let me take a look.

| username: h5n1 | Original post link

Shut down all TiDB instances and then restart them.

| username: zhanggame1 | Original post link

The almighty reboot method

| username: dba远航 | Original post link

The earlier DDL hasn’t finished executing, and a new one has already been started.

| username: wangccsy | Original post link

Why not just check what table this is?

| username: Jellybean | Original post link

Have you tried running ADMIN CANCEL DDL JOBS with this job ID? Does it execute normally?

You can check these IDs in the system table to find the corresponding table and related task information.
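
As a rough sketch of what those lookups might look like, the Python snippet below only builds the statement text from the three IDs in the log line; running the statements still needs a MySQL-protocol client connected to TiDB. The helper name `build_lookup_statements` is hypothetical, and whether the blocking connection is actually visible in `CLUSTER_PROCESSLIST` on v6.5.3 is an assumption (the original poster did not find a long-running transaction there).

```python
# IDs taken from the log line in the first post.
TABLE_ID = 12673719
JOB_ID = 12680983
CONN_ID = 5606557124618123315


def build_lookup_statements(table_id, job_id, conn_id):
    """Build (not execute) TiDB statements for looking up the logged IDs."""
    # int() rejects anything that is not a plain number before it reaches SQL.
    table_id, job_id, conn_id = int(table_id), int(job_id), int(conn_id)
    return {
        # Map the internal table ID to a schema and table name.
        "table": (
            "SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
            "WHERE TIDB_TABLE_ID = {}".format(table_id)
        ),
        # Lists running DDL jobs plus recent history; look for the job ID.
        "ddl_jobs": "ADMIN SHOW DDL JOBS",
        # Search every TiDB instance for the connection named in the log.
        "session": (
            "SELECT INSTANCE, ID, USER, HOST, DB, COMMAND, TIME, STATE, INFO "
            "FROM INFORMATION_SCHEMA.CLUSTER_PROCESSLIST "
            "WHERE ID = {}".format(conn_id)
        ),
        # Only run this after confirming the job should really be cancelled.
        "cancel": "ADMIN CANCEL DDL JOBS {}".format(job_id),
    }


statements = build_lookup_statements(TABLE_ID, JOB_ID, CONN_ID)
for name, sql in statements.items():
    print("-- " + name)
    print(sql + ";")
```

A job that has already reached the done state may refuse cancellation, so the output of the cancel statement is worth checking rather than assuming it worked.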

| username: 随缘天空 | Original post link

Judging from the log, an old running transaction is probably holding a lock on the table. Look up how to check table lock information, then try killing that session.
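
A possible way to follow that suggestion, sketched in Python as plain statement text: the log message points at the metadata lock feature, so the `mysql.tidb_mdl_view` view and `INFORMATION_SCHEMA.CLUSTER_TIDB_TRX` are the places to look for the blocking session. The helper name `build_unblock_statements` is hypothetical, and it is an assumption that the blocking session shows up in these views in this particular case.

```python
def build_unblock_statements(session_id=None):
    """Build (not execute) statements to find and kill a blocking session."""
    statements = [
        # DDL jobs currently blocked by a metadata lock, with the blocking session.
        "SELECT * FROM mysql.tidb_mdl_view",
        # Open transactions on all TiDB instances, oldest first.
        (
            "SELECT INSTANCE, SESSION_ID, START_TIME, STATE, USER, DB "
            "FROM INFORMATION_SCHEMA.CLUSTER_TIDB_TRX ORDER BY START_TIME"
        ),
    ]
    if session_id is not None:
        # Add the KILL only once a blocking session has been identified.
        statements.append("KILL TIDB {}".format(int(session_id)))
    return statements


# Connection ID reported in the log line of the first post.
plan = build_unblock_statements(5606557124618123315)
for sql in plan:
    print(sql + ";")
```

Calling the helper without an argument returns only the two read-only queries, which is the safer first step on a production cluster.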