Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 主键从int改为bigint
【TiDB Usage Environment】Production Environment
【TiDB Version】5.4.0
【Reproduction Path】What operations were performed when the issue occurred
【Encountered Issue: Issue Phenomenon and Impact】
Need to change the primary key of a table with 180 million rows from int to bigint. Are there any precautions? Will it lock the table?
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】
This belongs to physical DDL. The characteristics of physical DDL are that it takes a long time to execute, and the execution time is related to the amount of data in the table, machine configuration, and business load. Executing physical DDL will affect the business load in two specific ways. On one hand, it needs to read data from TiKV and write new data, which will consume TiKV’s CPU and I/O resources. On the other hand, the TiDB node where the DDL Owner is located needs to perform corresponding calculations, thus consuming more CPU resources.
When executing physical DDL (including adding indexes or changing column types), appropriately adjusting the following system variables can balance the DDL execution speed and its impact on business load:
tidb_ddl_reorg_worker_cnt
: This variable is used to set the number of reorg workers for DDL operations, controlling the concurrency of backfilling.
tidb_ddl_reorg_batch_size
: This variable is used to set the batch size during the re-organize
phase of DDL operations to control the amount of data backfilled.
Recommended values:
- In the absence of other loads, if you need to complete
ADD INDEX
as soon as possible, you can increase the values of tidb_ddl_reorg_worker_cnt
and tidb_ddl_reorg_batch_size
appropriately, for example, setting the two variables to 20
and 2048
respectively.
- In the presence of other loads, if you need to minimize the impact of
ADD INDEX
on other business operations, you can decrease the values of tidb_ddl_reorg_worker_cnt
and tidb_ddl_reorg_batch_size
appropriately, for example, setting the two variables to 4
and 256
respectively.
There is also a basic pitfall to remind you of:
- When multiple DDL statements are executed together, the latter DDL statements will be relatively slow. The reason is that DDL operations in the current TiDB cluster are executed serially.
集群管理 FAQ | PingCAP 文档中心
Do it during off-peak business hours.
It is recommended to upgrade the version before making changes, as the new version changes particularly quickly.
That’s true.
However, upgrading the version just to change a character type seems a bit excessive, haha.
Just add it during a low business peak period.
Yes, during off-peak business hours, start running it, then increase the backfill parameters to make it run faster, and it will be fine.
Choose a low-traffic period to handle it.
It is estimated that the time will be very long, so it is recommended to have an expectation. It is suggested to increase the tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size parameters during off-peak periods for backfill execution.
Additionally, the rollback might also take a long time.
Thank you for the explanation.
It depends on personal choice. In our project, if it feels slow, we just upgrade to a new version, and it becomes much faster. The client is very satisfied. 
In TiDB, all supported DDL changes are executed online, so they shouldn’t be slow, right?