DDL is stuck and ADMIN SHOW DDL JOBS fails

translator_bot · June 21, 2024, 1:09am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: ddl卡住，show执行不出来

| username: xie123

[TiDB Usage Environment] Production Environment / Test / Poc
[TiDB Version]
V6.5.5
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issues: Issue Phenomenon and Impact]

DROP TABLE hangs.
ADMIN SHOW DDL; shows DDL statements that have been stuck for a long time (two entries: an add index job that is still running, followed by the drop table job):

SCHEMA_VER: 404
OWNER_ID: 2be432ce-525a-49df-8e46-f53c1f10888d
OWNER_ADDRESS: xx:4000
RUNNING_JOBS:
ID:528, Type:drop table, State:queueing, SchemaState:public, SchemaID:90, TableID:444, RowCount:0, ArgLen:0, start time: 2024-03-06 16:29:56.586 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0
ID:526, Type:add index, State:running, SchemaState:write reorganization, SchemaID:90, TableID:444, RowCount:0, ArgLen:0, start time: 2024-03-06 16:12:13.186 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:448193549109821543, UniqueWarnings:0
SELF_ID: 3841a51f-6760-4dc0-a260-8ce125d06f6f
QUERY:
DROP TABLE IF EXISTS `table_name`
/* ApplicationName=DataGrip 2023.3.4 */ CREATE INDEX xx ON table_name (xx, xx)
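
The two job lines in this output are easier to compare once they are split into fields. The following is a minimal Python sketch, not a TiDB tool: the function names `parse_job_line` and `blocking_jobs` are hypothetical, and the field layout is assumed only from the two lines quoted in this post. For each queued job it lists the earlier jobs on the same TableID that are still ahead of it.

```python
def parse_job_line(line: str) -> dict[str, str]:
    """Split one 'Key:value, Key:value' job line into a dict of strings."""
    fields = {}
    for part in line.strip().split(", "):
        # Split on the first colon only, so timestamps keep their colons.
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def blocking_jobs(lines: list[str]) -> dict[int, list[int]]:
    """Map each queueing job ID to the IDs of earlier jobs on the same table."""
    jobs = [parse_job_line(line) for line in lines if line.strip()]
    result = {}
    for job in jobs:
        if job["State"] != "queueing":
            continue
        result[int(job["ID"])] = sorted(
            int(other["ID"])
            for other in jobs
            if other["TableID"] == job["TableID"]
            and int(other["ID"]) < int(job["ID"])
        )
    return result


# The two job lines exactly as quoted in the post.
RUNNING_JOBS = [
    "ID:528, Type:drop table, State:queueing, SchemaState:public, SchemaID:90, "
    "TableID:444, RowCount:0, ArgLen:0, start time: 2024-03-06 16:29:56.586 +0800 CST, "
    "Err:<nil>, ErrCount:0, SnapshotVersion:0",
    "ID:526, Type:add index, State:running, SchemaState:write reorganization, SchemaID:90, "
    "TableID:444, RowCount:0, ArgLen:0, start time: 2024-03-06 16:12:13.186 +0800 CST, "
    "Err:<nil>, ErrCount:0, SnapshotVersion:448193549109821543, UniqueWarnings:0",
]

print(blocking_jobs(RUNNING_JOBS))  # {528: [526]}
```

With the lines from this post, the drop table job 528 is queued behind the add index job 526 on TableID 444, so canceling only job 528 would leave job 526 in place.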

ADMIN SHOW DDL JOBS; fails with an error:

ERROR 1105 (HY000): tikv aborts txn: Error(InvalidKeyRangeMode { cmd: scan, storage_api_version: V2, range: (Some("6D44444C4A6F6248FF69FF73746F727900FF0000FC0000000000FF0000690000000000FA"), None) })
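
The hex string in the error is the start key of the scan that TiKV rejected. As a rough aid for reading it, the Python sketch below assumes the key uses the memcomparable byte encoding (8 data bytes followed by one marker byte, where 0xFF means a full group and a smaller marker gives the padding count as 0xFF minus the marker), applied once by TiKV and once by TiDB after a one-byte `m` prefix. The helper name `decode_memcomparable` is hypothetical, and the layering is an assumption rather than something stated in this thread.

```python
def decode_memcomparable(buf: bytes) -> tuple[bytes, bytes]:
    """Decode one memcomparable-encoded byte string; return (decoded, rest)."""
    out = bytearray()
    pos = 0
    while True:
        group = buf[pos:pos + 9]
        if len(group) < 9:
            raise ValueError("truncated group")
        pad = 0xFF - group[8]  # marker byte encodes the padding count
        if pad > 8 or any(group[8 - pad:8]):
            raise ValueError("invalid marker or non-zero padding")
        out += group[:8 - pad]
        pos += 9
        if pad:  # a padded group is the last one
            return bytes(out), buf[pos:]


# Start key copied from the error message in the post.
START_KEY_HEX = (
    "6D44444C4A6F6248FF69FF73746F727900FF"
    "0000FC0000000000FF0000690000000000FA"
)

# Assumed outer layer: the encoding TiKV applies to the whole key.
raw, _ = decode_memcomparable(bytes.fromhex(START_KEY_HEX))
# Assumed inner layer: an encoded name following the one-byte prefix.
name, rest = decode_memcomparable(raw[1:])

print(raw[:1], name, rest.hex())  # b'm' b'DDLJobHistory' 0000000000000069
```

Under those assumptions the key decodes to the prefix `m` followed by `DDLJobHistory`, which suggests the rejected scan is the read of DDL job history that ADMIN SHOW DDL JOBS performs. The decoding alone does not explain why TiKV treats the range as invalid under API V2.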

I searched all TiDB logs for the job IDs related to this table and canceled those jobs:

ADMIN CANCEL DDL JOBS 521,522,523,524,525,526,528,529,533,534,535,536,537,538,541,542,543;
+--------+----------------------------------------+
| JOB_ID | RESULT                                 |
+--------+----------------------------------------+
| 521    | error: [ddl:8224]DDL Job:521 not found |
| 522    | error: [ddl:8224]DDL Job:522 not found |
| 523    | error: [ddl:8224]DDL Job:523 not found |
| 524    | error: [ddl:8224]DDL Job:524 not found |
| 525    | error: [ddl:8224]DDL Job:525 not found |
| 526    | successful                             |
| 528    | successful                             |
| 529    | successful                             |
| 533    | successful                             |
| 534    | successful                             |
| 535    | successful                             |
| 536    | successful                             |
| 537    | successful                             |
| 538    | successful                             |
| 541    | successful                             |
| 542    | successful                             |
| 543    | successful                             |
+--------+----------------------------------------+
P.S. At first I didn't notice there were two jobs and canceled only the drop table DDL job. The stuck add index job kept running, and a subsequent rename also hung.
After canceling all of the jobs above, I retried renaming the table, and it succeeded after about 8 minutes.

Seeking help:

Why did adding an index to a table with only 2 rows get stuck?
Is ADMIN SHOW DDL JOBS failing because of storage_api_version V2? Can the API version be rolled back?

[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots/Logs/Monitoring]

| username: changpeng75

The Job queue for online DDL is persisted in TiKV, so after a restart, the Job should still run. Was the Job canceled later?

| username: xie123

It seems that there was an issue with adding an index when using API version 2.

| username: dba远航

Adding the index got stuck with only 2 records, so it is probably unrelated to data volume. Something else is likely blocking the job.

| username: FutureDB

Did you change the value of the API version? The default value was 1 before.

| username: 有猫万事足

I checked, and this api-version=2 setting is indeed unusual. My own cluster runs 7.5.1, and there this value is set to 1.

| username: xie123

To reproduce: with API version 2 on v6.5.5 or earlier, create an empty table and add an index to it. A similar fix already exists in the community, and v6.5.6 does not have this problem. Still, API version 2 is really troublesome…

| username: Hacker_QGgM2nks

Older versions do seem to have issues. We are on 5.3 and still see processes lingering after they are killed.

| username: 芝士改变命运

What is the reason?

| username: TiDBer_HErMeXDz

Upgrade…

| username: Hacker_xUwtuKxa

I didn’t understand the question. If the add index operation is not completed and the subsequent drop table operation gets stuck, it’s normal behavior. How is it a bug?

| username: 呢莫不爱吃鱼

What is the issue?
