Execution of TRUNCATE statement is stuck

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 执行truncate 语句卡住

| username: TiDBer_vFs1A6CZ

[TiDB Usage Environment] Production Environment
[TiDB Version] 6.5.1
[Reproduction Path] What operations were performed when the problem occurred: No operations were performed
[Encountered Problem: Problem Phenomenon and Impact] Executing truncate on a table, the operation statement gets stuck, and subsequent scheduling tasks cannot be executed normally
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page

[Attachments: Screenshots/Logs/Monitoring]
Monitoring DDL execution status

Attempted Solutions

  1. ADMIN CANCEL DDL JOBS related job_id
  2. – Execute tidb owner re-election
    curl -X POST http://xxxx:xx/ddl/owner/resign

Current Status
The truncate operation on this table is stuck, causing subsequent DDL and DML statements to be unexecutable. How should this be resolved?

| username: h5n1 | Original post link

Restart method: shut down all TiDB nodes, then start them again.

| username: 昵称想不起来了 | Original post link

Restart after closing everything.

| username: 大飞哥online | Original post link

SELECT ID, USER, INSTANCE, INFO FROM INFORMATION_SCHEMA.CLUSTER_PROCESSLIST; ------Query all active queries in the current cluster, find the id of the truncate table, then kill the id. Take a look.

| username: 大飞哥online | Original post link

ADMIN CANCEL DDL JOBS job_id doesn’t work either, then just restart the TiDB node :joy:

| username: TiDBer_vFs1A6CZ | Original post link

I have tried using kill id, but it doesn’t work very well.

| username: TiDBer_vFs1A6CZ | Original post link

Is it necessary to restart the leader node, or restart all TiDB server nodes one by one, or shut down all TiDB server nodes and then restart them? Currently, there are 50,000 scheduling tasks running on TiDB every day. How can we minimize the impact as much as possible?

| username: TiDBer_vFs1A6CZ | Original post link

Are there any other methods? Shutting down all TiDB servers would have too much of an impact on the production environment.

| username: redgame | Original post link

Restart after closing everything.

| username: zhanggame1 | Original post link

There’s no better solution, you can restart with tiup cluster restart tidb-test -R tidb.

| username: 像风一样的男子 | Original post link

Restart command: tiup cluster restart clustername -R tidb

| username: 大飞哥online | Original post link

At that time, did you check for any zombie processes on the TiDB server using the ps command with the l parameter? The second column from the left should display a ‘Z’.

| username: 大飞哥online | Original post link

Restarting in a polling manner is also acceptable.

| username: TiDBer_vFs1A6CZ | Original post link

Checked, no zombie processes.

| username: TiDBer_vFs1A6CZ | Original post link

The current answer is that we can only restart.

| username: 大飞哥online | Original post link

So do you shut them all down one by one and then start them up one by one, or do you restart each TiDB instance individually?

| username: TiDBer_vFs1A6CZ | Original post link

Haven’t restarted yet, is there any other good solution?

| username: 大飞哥online | Original post link

Restart it :joy:

| username: 大飞哥online | Original post link

ADMIN CANCEL and kill are not working, it seems that we can only resort to restarting.

| username: TiDBer_vFs1A6CZ | Original post link

Sequentially restarted the TiDB server nodes, but the DDL task is still stuck and has not been canceled.