Cancelling DDL statements stays stuck in the cancelling state

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 取消DDL语句一直在cancelling

| username: SummerGu

[TiDB Usage Environment] Production Environment
[TiDB Version] Upgraded from 5.4 to 7.5

The cluster was upgraded several months ago. A few days ago, it suddenly became impossible to create tables. I used ADMIN SHOW DDL JOBS to find the blocked DDL jobs, then ran ADMIN CANCEL DDL JOBS xxx,xxx,xxx… to cancel all pending DDLs, but they remain in the cancelling state.
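
For readers following along, here is a minimal Python sketch of the two steps described above. It assumes the output of ADMIN SHOW DDL JOBS has already been fetched into a list of dicts keyed by column name; only the JOB_ID and STATE columns are used. The helper names and the sample rows are hypothetical, and no database connection is made.

```python
# Hypothetical helpers: each row is a dict keyed by the column names that
# ADMIN SHOW DDL JOBS returns. Only JOB_ID and STATE are used here.

def build_cancel_statement(rows, states=("queueing", "running")):
    """Build an ADMIN CANCEL DDL JOBS statement for jobs in the given states.

    Returns None when no job matches.
    """
    job_ids = sorted(int(r["JOB_ID"]) for r in rows if r["STATE"].lower() in states)
    if not job_ids:
        return None
    return "ADMIN CANCEL DDL JOBS " + ", ".join(str(j) for j in job_ids) + ";"


def stuck_cancelling(rows):
    """Return the IDs of jobs that are still reported as cancelling."""
    return sorted(int(r["JOB_ID"]) for r in rows if r["STATE"].lower() == "cancelling")


# Made-up sample rows, for illustration only.
rows = [
    {"JOB_ID": 101, "STATE": "queueing"},
    {"JOB_ID": 102, "STATE": "cancelling"},
    {"JOB_ID": 103, "STATE": "synced"},
]

print(build_cancel_statement(rows))
print(stuck_cancelling(rows))
```

Re-running ADMIN SHOW DDL JOBS after a while and passing the new rows to `stuck_cancelling` shows whether the cancellation is making progress or is stuck, as in this thread.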

I also tried electing a new DDL owner, as described in the SQL Operations FAQ (SQL 操作常见问题 | PingCAP 文档中心), but the jobs still cannot be cancelled. Is there any solution?

Thank you!

| username: forever | Original post link

The most reliable solution at present is to restart all tidb-servers.

| username: DBAER | Original post link

It seems like in many of these scenarios, a restart is required.

| username: 江湖故人 | Original post link

Try the expert’s method:

  1. Check mysql.tidb_mdl_view and kill blocking processes.
  2. Shut down all TiDB servers, then restart them.
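
Step 1 can be sketched roughly as follows in Python. It assumes the rows of `SELECT * FROM mysql.tidb_mdl_view` have already been fetched into dicts. The session column name is an assumption: it may be displayed differently depending on the version (for example with a space instead of an underscore), so check the actual output first. The helper name and the sample values are hypothetical.

```python
# Hypothetical helper: turns rows from mysql.tidb_mdl_view into KILL statements.
# The column name "SESSION_ID" is an assumption; pass the name your version shows.

def build_kill_statements(mdl_rows, session_column="SESSION_ID"):
    """Return one KILL statement per distinct blocking session, in ascending ID order."""
    session_ids = sorted({int(row[session_column]) for row in mdl_rows})
    return [f"KILL {sid};" for sid in session_ids]


# Made-up sample rows, for illustration only.
mdl_rows = [
    {"JOB_ID": 141, "SESSION_ID": 2199023255957},
    {"JOB_ID": 141, "SESSION_ID": 2199023255957},
    {"JOB_ID": 142, "SESSION_ID": 2199023255960},
]

for stmt in build_kill_statements(mdl_rows):
    print(stmt)
```

Review the generated statements before running them, since each one terminates a live session on a production cluster.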

| username: TiDBer_5cwU0ltE | Original post link

TiDB does not seem to have event tracing functionality similar to Oracle Event Trace. If conditions permit, try to find some downtime to restart the node.

| username: 这里介绍不了我 | Original post link

Take a look at this issue.

| username: 小于同学 | Original post link

The magic of rebooting

| username: redgame | Original post link

Restart the TiDB server

| username: Kongdom | Original post link

Indeed, restarting the TiDB nodes is generally the more reliable fix. Note that this means restarting only the TiDB nodes, not the whole TiDB cluster.

| username: zhanggame1 | Original post link

Restart all TiDB servers.

| username: 小龙虾爱大龙虾 | Original post link

Go check the DDL owner logs.

| username: zhaokede | Original post link

Try restarting during idle time to see if it can be restored.

| username: FutureDB | Original post link

Is this a bug in TiDB? Although restarting works, you can’t just restart in a production environment.

| username: TiDBer_aaO4sU46 | Original post link

Restart all TiDB servers.

| username: kelvin | Original post link

You can’t just restart in a production environment, right? Let’s see if any developers can take a look.

| username: 哈喽沃德 | Original post link

Just wait. The data volume is probably too large, and the job is still waiting for the rollback to finish.

| username: 随便改个用户名 | Original post link

Is the metadata lock still being held? Try killing any other sessions that are still executing as well.

| username: 连连看db | Original post link

The upgrade process is not user-friendly, which discourages many users :face_exhaling:

| username: 饭光小团 | Original post link

I can handle this. Based on my experience:

  1. Restart all TiDB nodes (I’ve done this once).
  2. Check the logs of the owner node for entries related to “schema not sync”. In my experience, a large DML operation may be running on a related table on one of the nodes. Find and kill that DML operation, and the problem should be resolved.
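
For step 2, a minimal Python sketch of the log check might look like this. The search phrase comes from the post above. The function name and the sample lines are hypothetical placeholders, not real TiDB log output. The current DDL owner can be looked up with ADMIN SHOW DDL before deciding which node's log to scan.

```python
# Hypothetical helper: scans log lines for the phrase mentioned in the post.

def find_schema_sync_lines(log_lines, needle="schema not sync"):
    """Return (line_number, line) pairs whose text contains the needle, ignoring case."""
    needle = needle.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if needle in line.lower()
    ]


# Placeholder lines, for illustration only.
sample = [
    "placeholder line A\n",
    "placeholder line mentioning schema not sync\n",
    "placeholder line C\n",
]

for number, line in find_schema_sync_lines(sample):
    print(number, line)
```

An open file object can be passed in place of the `sample` list, so the same function works on the owner node's log file.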

| username: YuchongXU | Original post link

The art of rebooting