Stuck Adding Index to Large Table

translator_bot · June 20, 2024, 4:17pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 大表添加索引卡住

| username: Kamner

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Reproduction Path] Adding index with create index
[Encountered Problem: Problem Phenomenon and Impact]
The table size is around 200G, with 26,741,006 rows. It has been running for almost a day without any progress.

translator_bot · June 20, 2024, 4:17pm

| username: hacker_77powerful | Original post link

What are the specifications of the host?

translator_bot · June 20, 2024, 4:17pm

| username: xfworld | Original post link

Upgrade to 7.x (6.5.x seems to support it as well), which supports fastreorg, and this issue will be resolved.

Previous versions did have this problem…

translator_bot · June 20, 2024, 4:17pm

| username: Kamner | Original post link

So now we can only wait, is there no other way?

translator_bot · June 20, 2024, 4:17pm

| username: xfworld | Original post link

5.x is single-threaded, indeed very slow…
Just have to wait

Or, create a new table, set up the indexes properly, and then fill it with data to replace the original table.

translator_bot · June 20, 2024, 4:18pm

| username: Kamner | Original post link

Seeing that ROW_COUNT remains unchanged, is it stuck?

translator_bot · June 20, 2024, 4:18pm

| username: TiDBer_HUfcQIJx | Original post link

Wait a moment.

translator_bot · June 20, 2024, 4:18pm

| username: yiduoyunQ | Original post link

As long as ROW_COUNT is increasing, it’s fine. Hopefully, it can increase faster by using version 6.5 or above to support tidb_ddl_enable_fast_reorg. If it doesn’t increase at all, check if there are any known issues with the version you are using.

translator_bot · June 20, 2024, 4:18pm

| username: TIDB-Learner | Original post link

Be patient and do not cancel the operation easily.

translator_bot · June 20, 2024, 4:18pm

| username: dba-kit | Original post link

Check the tidb-server logs to see if there are any errors.

translator_bot · June 20, 2024, 4:18pm

| username: Kamner | Original post link

The server logs do not report any errors, and tikv_stderr.log is empty.

It seems that tikv.log is unrelated.

translator_bot · June 20, 2024, 4:18pm

| username: TiDBer_QYr0vohO | Original post link

Yes, as long as ROW_COUNT is changing, there is no problem.

translator_bot · June 20, 2024, 4:18pm

| username: 柴米油酱 | Original post link

As long as ROW_COUNT is changing, or upgrade to a version that supports the parameter tidb_ddl_enable_fast_reorg.

translator_bot · June 20, 2024, 4:18pm

| username: TiDBer_H5NdJb5Q | Original post link

Waiting for an answer, I no longer dare to add an index to a large table.

translator_bot · June 20, 2024, 4:18pm

| username: Kongdom | Original post link

I have experienced this situation and ultimately chose to upgrade the version.

Before the upgrade, the version was v5.1.0. Database performance: Adding indexes was particularly slow, taking tens of minutes or even over an hour for slightly larger tables.
After the upgrade, the version is v6.5.4. Database performance: Adding an index to a table with 160 million records took 192 seconds, around 3 minutes.

translator_bot · June 20, 2024, 4:18pm

| username: FutureDB | Original post link

Is adding an index to hundreds of millions of data in version 6.5.4 really that fast?

translator_bot · June 20, 2024, 4:18pm

| username: TIDB-Learner | Original post link

It also depends on the specific situation. Look at the number and types of fields.

translator_bot · June 20, 2024, 4:18pm

| username: Kongdom | Original post link

This is actual test data with the same DDL statement. It was indeed quite shocking at the time~

translator_bot · June 20, 2024, 4:18pm

| username: Kamner | Original post link

Finally, it was resolved as follows:
Reference:

Disable Compaction Filter to speed up GC

After disabling, observe the GC situation of this table

explain analyze select * from table_name;

total_process_keys: 27467763, total_process_keys_size: 15794369502, total_keys: 842294971
total_keys/total_process_keys=842294971/27467763=30.67

The higher the value of total_keys/total_process_keys, the higher the GC delay, and the more mvcc versions, which will cause more data blocks to be scanned when building the index.

After the modification, the GC delay was reduced the next day, and the DDL for adding the index was also completed
total_process_keys: 26354251, total_process_keys_size: 15873718557, total_keys: 57310326
total_keys/total_process_keys=57310326/26354251=2.17

Thanks to everyone for their support online and offline @h5n1 @hey-hoho

translator_bot · June 20, 2024, 4:18pm

| username: Kamner | Original post link

Additionally, refer to this post: