DDL Stuck and Unresponsive

translator_bot · June 23, 2024, 4:47am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: DDL卡住不动

| username: wakaka

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.2.2
[Encountered Problem] DDL execution gets stuck at a certain point for a long time
[Reproduction Path] Running an ADD INDEX DDL statement
[Problem Phenomenon and Impact]
The table has about 170 million rows. The ADD INDEX job stalls at around 140 million rows and makes no progress for half an hour.

I checked the DDL worker logs and found no errors.

[Attachment]

| username: forever

First, check whether the data has accumulated too many versions that GC hasn’t cleaned up. A post discussed this issue a few days ago:
Adding an index to a 900-million-row table hasn’t finished after more than a day; how to troubleshoot? - TiDB - TiDB Q&A Community (asktug.com)
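
One way to check the GC state suggested above is to read the `tikv_gc%` rows from the `mysql.tidb` table. The following is a minimal Python sketch, not a definitive procedure: the helper names `parse_go_duration` and `gc_life_time_seconds` are made up for illustration, and it assumes the rows have already been fetched with some MySQL-compatible client.

```python
import re

# Query for TiDB's GC bookkeeping rows; run it with any MySQL-compatible client.
GC_STATUS_SQL = (
    "SELECT VARIABLE_NAME, VARIABLE_VALUE FROM mysql.tidb "
    "WHERE VARIABLE_NAME LIKE 'tikv_gc%';"
)

_UNIT_SECONDS = {"ms": 0.001, "h": 3600, "m": 60, "s": 1}


def parse_go_duration(text):
    """Convert a Go-style duration such as '10m0s' or '24h0m0s' to seconds."""
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|h|m|s)", text)
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def gc_life_time_seconds(rows):
    """rows: iterable of (VARIABLE_NAME, VARIABLE_VALUE) pairs from GC_STATUS_SQL."""
    settings = dict(rows)
    return parse_go_duration(settings["tikv_gc_life_time"])
```

If `tikv_gc_safe_point` in the same result set lags far behind the current time, old versions are probably not being cleaned up, which is the situation described above.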

| username: wakaka

At that time, the GC was set to 10 minutes, and there weren’t a large number of data versions.

| username: wakaka

Following the post’s instructions didn’t work either.

| username: forever

Have you resolved it?

| username: HACK

This problem seems quite common. I’ve seen several people run into it.

| username: xfworld

When you add an index to a table that already holds data, TiDB has to backfill index entries for the existing rows. If that step is too slow, you can raise the backfill speed. However, doing so will have a significant performance impact on the cluster…
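
The backfill speed is normally adjusted through the `tidb_ddl_reorg_worker_cnt` and `tidb_ddl_reorg_batch_size` system variables. Below is a small Python sketch that builds the corresponding statements. The function name `reorg_tuning_sql` is hypothetical, and the allowed value ranges depend on the TiDB version, so check the documentation for your release before applying anything.

```python
def reorg_tuning_sql(worker_cnt, batch_size):
    """Build the SET GLOBAL statements that control ADD INDEX backfill speed.

    worker_cnt -> tidb_ddl_reorg_worker_cnt (backfill concurrency)
    batch_size -> tidb_ddl_reorg_batch_size (rows handled per batch)
    Only basic sanity checks are done here; TiDB enforces its own ranges.
    """
    for name, value in (("worker_cnt", worker_cnt), ("batch_size", batch_size)):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return [
        f"SET GLOBAL tidb_ddl_reorg_worker_cnt = {worker_cnt};",
        f"SET GLOBAL tidb_ddl_reorg_batch_size = {batch_size};",
    ]


if __name__ == "__main__":
    # Example values only; larger values speed up backfill but add cluster load.
    for statement in reorg_tuning_sql(worker_cnt=8, batch_size=512):
        print(statement)
```

Raising either value trades cluster headroom for backfill throughput, so it is usually safer to increase them gradually while watching latency.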

| username: wakaka

I tried 3 times and adjusted the parameters, but it didn’t work. It always gets stuck at the point shown in the screenshot and doesn’t proceed.

| username: xfworld

Cancel the DDL job and try again.
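
In TiDB, a running DDL job can be located with `ADMIN SHOW DDL JOBS` and cancelled with `ADMIN CANCEL DDL JOBS`. Here is a short Python sketch that builds the cancel statement. The helper name `cancel_ddl_jobs_sql` and the job ID in the example are illustrative assumptions, not values from this thread.

```python
# Lists recent DDL jobs, including JOB_ID, JOB_TYPE, STATE and ROW_COUNT.
SHOW_DDL_JOBS_SQL = "ADMIN SHOW DDL JOBS;"


def cancel_ddl_jobs_sql(job_ids):
    """Build an ADMIN CANCEL DDL JOBS statement for one or more job IDs."""
    ids = list(job_ids)
    if not ids:
        raise ValueError("at least one job ID is required")
    if any(not isinstance(job_id, int) or job_id < 0 for job_id in ids):
        raise ValueError("job IDs must be non-negative integers")
    return "ADMIN CANCEL DDL JOBS " + ", ".join(str(job_id) for job_id in ids) + ";"


if __name__ == "__main__":
    print(SHOW_DDL_JOBS_SQL)
    print(cancel_ddl_jobs_sql([123]))  # 123 is a placeholder job ID
```

After cancelling, it is worth running `ADMIN SHOW DDL JOBS` again to confirm the job has finished rolling back before retrying the ADD INDEX.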

| username: wakaka

I tried 3 times and waited several hours each time, but it didn’t work.

| username: xfworld

Create a new table with the same structure, build all the indexes on it, and move the data over… then delete the original table and rename the new one…
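
The rebuild-and-swap workaround above can be sketched as a list of SQL statements generated in Python. This is only an outline under several assumptions that do not come from the thread: the table has an integer primary key named `id`, the table, index, and column names are placeholders, the copy is split into primary-key ranges so that no single transaction becomes too large, and writes to the original table are paused during the copy.

```python
def batch_ranges(min_id, max_id, batch_size):
    """Yield half-open [start, end) primary-key ranges covering min_id..max_id."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size, max_id + 1)
        yield start, end
        start = end


def rebuild_plan(table, index_name, index_cols, min_id, max_id, batch_size, pk="id"):
    """Return the SQL statements for the copy-and-swap workaround, in order."""
    new_table = f"{table}_new"  # hypothetical name for the replacement table
    cols = ", ".join(f"`{col}`" for col in index_cols)
    plan = [
        f"CREATE TABLE `{new_table}` LIKE `{table}`;",
        # Adding the index on the empty table avoids the long backfill.
        f"ALTER TABLE `{new_table}` ADD INDEX `{index_name}` ({cols});",
    ]
    for start, end in batch_ranges(min_id, max_id, batch_size):
        plan.append(
            f"INSERT INTO `{new_table}` SELECT * FROM `{table}` "
            f"WHERE `{pk}` >= {start} AND `{pk}` < {end};"
        )
    plan.append(f"DROP TABLE `{table}`;")
    plan.append(f"RENAME TABLE `{new_table}` TO `{table}`;")
    return plan
```

A more cautious variant would rename the original table to a backup name instead of dropping it, and remove the backup only after the new table has been verified.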

| username: alfred

Is it possible to do session tracking or Linux process tracking to see where it is stuck?
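
Short of tracing the process, a simple first step is to sample the `ROW_COUNT` column of `ADMIN SHOW DDL JOBS` at intervals and check whether it has stopped advancing. The following Python sketch works on samples that have already been collected; the function name `is_stalled` and the sample format are assumptions made for illustration.

```python
def is_stalled(samples, window_seconds):
    """Report whether the row count has been flat for at least window_seconds.

    samples: list of (unix_time, row_count) tuples sorted by time, for example
             collected by polling ADMIN SHOW DDL JOBS for the ADD INDEX job.
    """
    if not samples:
        return False
    last_time, last_rows = samples[-1]
    flat_since = last_time
    # Walk backwards to find when the current row count was first observed.
    for sample_time, rows in reversed(samples):
        if rows != last_rows:
            break
        flat_since = sample_time
    return last_time - flat_since >= window_seconds


if __name__ == "__main__":
    # Hypothetical samples: progress stops at 140 million rows.
    history = [(0, 100_000_000), (600, 140_000_000),
               (1200, 140_000_000), (2400, 140_000_000)]
    print(is_stalled(history, window_seconds=1800))
```

Knowing exactly when the counter stopped makes it easier to match the stall against TiDB and TiKV logs from the same time window.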

| username: xfworld

I suspect a bug is still behind this issue. If possible, upgrade to 5.2.4.
The method I mentioned can help you get past it…
But to resolve it completely, you still need to upgrade…

| username: wakaka

Will upgrading introduce any new issues? It seems that every problem I encounter requires an upgrade to resolve. I checked the 5.2.4 fix list and didn’t see a fix for this bug.

| username: xfworld

These are all old issues with data processing.

I recommend upgrading to 5.2.4, as it has fixed some known issues.

| username: wakaka

My previous version was 5.0.6, and I am concerned that upgrading to a version far removed from it might introduce additional bugs. I am not sure which version would be appropriate to upgrade to.

| username: xfworld

If you’re concerned, you can set up a test environment, run a POC, and base your evaluation on the results.

| username: wakaka

Time and cost are also a concern, especially for a large cluster of more than 50 TB. Even adding an index is already a major operation.

| username: xfworld

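This was just compiled; you can use it as a reference:

TiDB Q&A Community – 28 Jul 22

[TiDB Community Wisdom Collection] 2022 Community User Upgrade Guide & Complete List of Column Articles on Upgrading

🌃 Resource Center, Community Wisdom Collection

Good news for everyone preparing to upgrade! Biaomei has put together a very complete upgrade guide for your reference: 1️⃣ In practice: notes on a major version upgrade (4.0.13→5.4.0). I. Background: our company’s TiDB cluster currently hosts the databases of 5 business systems. All 5 are relatively important to the company and are characterized by high-concurrency writes, high-concurrency queries, or large-batch data queries. TiDB iterates quickly; although the TiDB v5.4.0 GA release, compared with TiDB...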
