TiDB DDL execution has been running for two days and cannot be deleted

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIDB DDL执行两天了,无法删除

| username: xmlianfeng

TiDB Version: 6.5.3
Operation Path:

  1. Create a new table
  2. Drop the table

Issue:
The drop of this empty table has been executing for 2 days without completing. In admin show ddl jobs, it is always in the running state.

Tried using admin cancel ddl jobs to stop the DDL, with the following result:

Querying INFORMATION_SCHEMA.TABLES by table ID shows that tidb_table_id = '621893' has no record.

Is there any way to forcibly delete this DDL job?
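For reference, a sketch of the statements involved (the job ID 622065 is taken from the syncer logs later in the thread and may not be the stuck job's actual JOB_ID — check ADMIN SHOW DDL JOBS first):

```sql
-- Locate the stuck job and note its JOB_ID
ADMIN SHOW DDL JOBS;

-- Try to cancel it by JOB_ID (622065 is a placeholder)
ADMIN CANCEL DDL JOBS 622065;

-- TiDB extends information_schema.tables with a TIDB_TABLE_ID column,
-- so a table can be looked up by its internal ID
SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TIDB_TABLE_ID = 621893;
```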

| username: h5n1 | Original post link

First, take a look at what’s inside mysql.tidb_mdl_view.
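A sketch of that check, assuming the v6.5 layout of the view (the session ID below is a placeholder):

```sql
-- Sessions currently holding metadata locks that block DDL jobs
SELECT * FROM mysql.tidb_mdl_view;

-- If a blocking session shows up, killing it by its session_id
-- releases the metadata lock (1234567 is a placeholder)
KILL 1234567;
```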

| username: xmlianfeng | Original post link

There is no record in this view. I wonder whether something went wrong during the deletion that updated or removed the table's ID records, leaving the DDL stuck.
The logs show:

[2023/08/18 14:59:37.666 +08:00] [INFO] [syncer.go:333] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 10.10.10.86, port 4000, id c6ea2820-553e-4ba8-b3bb-693bd3695390"] ["ddl id"=622065] [ver=500071]
[2023/08/18 14:59:37.666 +08:00] [INFO] [syncer.go:333] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 10.10.10.247, port 4000, id e0f1f71b-20dc-44c3-93e8-92858dab03ac"] ["ddl id"=622065] [ver=500071]
[2023/08/18 14:59:37.666 +08:00] [INFO] [syncer.go:333] ["[ddl] syncer check all versions, someone is not synced"] [info="instance ip 10.10.10.74, port 4000, id e8790a44-c60f-4bce-aaae-c9934bb82b17"] ["ddl id"=622065] [ver=500071]

| username: h5n1 | Original post link

Official troubleshooting for identifying a stuck DDL:

  1. First, rule out the common reasons for DDL statements executing slowly.
  2. Use any of the following methods to find the DDL owner node:
  • Use curl http://{TiDBIP}:10080/info/all to get the current cluster’s owner.
  • Check the DDL > DDL META OPM monitoring panel to see the owner for a specific time period.
  • If the owner does not exist, try manually triggering an owner election: curl -X POST http://{TiDBIP}:10080/ddl/owner/resign

I estimate that it might be necessary to restart all TiDB servers.

| username: tidb菜鸟一只 | Original post link

You need to restart the tidb-server.

| username: xmlianfeng | Original post link

All TiDB instances have been restarted… but it’s still stuck.

I don’t think it’s because of slowness. This table is empty. Could it be that the table_id cannot be found?

| username: h5n1 | Original post link

Did you restart the TiDB servers one by one, or stop them all first and then restart?

| username: xmlianfeng | Original post link

Tried both methods, also tried shutting everything down and starting a single tidb-server first. No effect.

| username: 有猫万事足 | Original post link

There is a similar issue on GitHub, and the last response points to the metadata lock.

Try following the method in the documentation to handle DDL blocking. It might be that the session you killed earlier is not the one causing the current blockage.

If it still doesn’t work, you might as well disable the metadata lock feature.

“In versions v6.5.0 and later, TiDB enables the metadata lock feature by default. When the cluster is upgraded from a version prior to v6.5.0 to v6.5.0 or later, TiDB will automatically enable the metadata lock feature. If you need to disable the metadata lock, you can set the system variable tidb_enable_metadata_lock to OFF.”
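If you do decide to disable it, it is a single system variable (a sketch; consider re-enabling it once the stuck job is cleared):

```sql
-- Check the current state of the metadata lock feature
SHOW VARIABLES LIKE 'tidb_enable_metadata_lock';

-- Disable it cluster-wide
SET GLOBAL tidb_enable_metadata_lock = OFF;
```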

| username: redgame | Original post link

There’s no good solution; you can only shut down the tidb-server.

| username: xmlianfeng | Original post link

I have tried shutting down the tidb-server too… It doesn’t seem to work for my situation.

| username: songxuecheng | Original post link

Run SELECT * FROM mysql.tidb_mdl_view and check if there are any records.

| username: zhanggame1 | Original post link

Is the DDL still stuck?

| username: kkpeter | Original post link

How did you finally recover? We are experiencing the same issue with version 6.5.1.

| username: zhanggame1 | Original post link

Just restart all TiDB components:

tiup cluster restart tidb-test -R tidb

I also encountered this issue.

| username: Billmay表妹 | Original post link

Was it resolved later?

| username: wjhuang2016-PINGCAP | Original post link

You can check:

  1. select * from mysql.tidb_mdl_info;
  2. The TiDB logs of the instance at IP 10.10.10.74, port 4000

| username: cy6301567 | Original post link

How was it resolved in the end?

| username: zhanggame1 | Original post link

Did it recover or not?

| username: TiDBer_vfJBUcxl | Original post link

Did it recover later?