TiDB: If a single table has around 20 million rows and a size of about 8GB, what is the best way to modify the table structure (add a new index)?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb,单个表数据量有2000万左右,表大小8G左右,如果需要修改表结构(新增索引),有什么比较好的方式?

| username: grafanab

[TiDB Usage Environment] Production Environment / Testing
[TiDB Version]
Need to modify the table structure of a certain table (add an index). The data volume is too large, and each modification fails. Is there a better way, similar to tools like pt-online-schema-change?
[Encountered Problems: Problem Phenomenon and Impact]
[Resource Configuration]
image

| username: 有猫万事足 | Original post link

I suggest you take a look at this. It is a relatively comprehensive test. You can use it as a reference.

| username: danghuagood | Original post link

TiDB itself supports online DDL. Tables with billions of rows and hundreds of gigabytes of data have been successfully altered. You need to check the specific reason for the failure and make corrections based on the error message.

| username: zhaokede | Original post link

It’s not a table that is accessed very frequently in the production system, you can operate on it directly. 8GB in size and 20 million records are not that large.

| username: 像风一样的男子 | Original post link

The data volume is not too large, it can be directly modified.

| username: grafanab | Original post link

The main issue is time; a long interval will affect the original business.

| username: DBAER | Original post link

Normal online DDL should not have any impact.

| username: Kongdom | Original post link

First, let’s talk about how to change the table structure. The cost varies depending on what you are changing.
Also, 20 million records is not a large amount of data, so adjusting the table structure won’t be too slow.

| username: onlyacat | Original post link

First, check which version you are using.
But generally, you can just make the changes directly.
Monitor the situation in real-time.

| username: grafanab | Original post link

Add an index

| username: WinterLiu | Original post link

Find a free time slot and add it directly.

| username: Kongdom | Original post link

Adding an index in v7.0 should be very fast.

Here is our real case, and we are using a virtual machine with mechanical disks.

The version before the upgrade was v5.1.0, database performance: adding an index was particularly slow, slightly larger tables took tens of minutes or even more than an hour.
The version after the upgrade is v6.5.4, database performance: adding an index to a table with 160 million records took 192 seconds, about 3 minutes.

| username: Trouble | Original post link

Add an index, 8GB directly added.

| username: 路在何chu | Original post link

I have added columns and indexes directly to tables with hundreds of millions of rows without any interruptions.

| username: tidb菜鸟一只 | Original post link

These two parameters can be adjusted to be larger.

| username: forever | Original post link

How large is the disk space? Could it be failing due to insufficient space? It should be quite fast for around 20 million.

| username: zhanggame1 | Original post link

Could the failure be due to the TiDB node not having temporary space configured?

| username: porpoiselxj | Original post link

For data of this scale, adding an index directly should be no problem. If it keeps getting stuck or fails, try checking the temporary directory of tidb-server. If it hasn’t been created automatically, create it manually. There is a bug in versions v7.10-v7.1.1. The default path is:
/tmp/tidb/tmp_ddl-4000
chown tidb:tidb -R /tmp/tidb/tmp_ddl-4000

| username: YuchongXU | Original post link

Supports online DDL

| username: tony5413 | Original post link

What version of TiDB are you using? What error is being reported? Are there any logs? It is best to make changes during off-peak business hours.