Changing Primary Key from int to bigint

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 主键从int改为bigint

| username: TiDBer_P49NMjIm

【TiDB Usage Environment】Production Environment
【TiDB Version】5.4.0
【Reproduction Path】What operations were performed when the issue occurred
【Encountered Issue: Issue Phenomenon and Impact】

Need to change the primary key of a table with 180 million rows from int to bigint. Are there any precautions? Will it lock the table?

【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】

| username: 大飞哥online | Original post link

This belongs to physical DDL. The characteristics of physical DDL are that it takes a long time to execute, and the execution time is related to the amount of data in the table, machine configuration, and business load. Executing physical DDL will affect the business load in two specific ways. On one hand, it needs to read data from TiKV and write new data, which will consume TiKV’s CPU and I/O resources. On the other hand, the TiDB node where the DDL Owner is located needs to perform corresponding calculations, thus consuming more CPU resources.

| username: 大飞哥online | Original post link

When executing physical DDL (including adding indexes or changing column types), appropriately adjusting the following system variables can balance the DDL execution speed and its impact on business load:

  • tidb_ddl_reorg_worker_cnt: This variable is used to set the number of reorg workers for DDL operations, controlling the concurrency of backfilling.
  • tidb_ddl_reorg_batch_size: This variable is used to set the batch size during the re-organize phase of DDL operations to control the amount of data backfilled.

Recommended values:

  • In the absence of other loads, if you need to complete ADD INDEX as soon as possible, you can increase the values of tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size appropriately, for example, setting the two variables to 20 and 2048 respectively.
  • In the presence of other loads, if you need to minimize the impact of ADD INDEX on other business operations, you can decrease the values of tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size appropriately, for example, setting the two variables to 4 and 256 respectively.
| username: 昵称想不起来了 | Original post link

No table locking

In TiDB, all supported DDL changes are performed online.

| username: 昵称想不起来了 | Original post link

Other things to note:

  1. It will take a long time. You can refer to the reply in this post: 变更字段类型非常慢 - TiDB 的问答社区
  2. The version you are using has this issue:
  • Currently, TiDB does not use metadata locks (MDL) to prevent DDL statements from modifying tables used by transactions. If the table definition is changed, committing the transaction will result in an error Information schema is changed. At this point, the transaction will be automatically rolled back.
    COMMIT | PingCAP 文档中心
| username: 昵称想不起来了 | Original post link

There is also a basic pitfall to remind you of:

  • When multiple DDL statements are executed together, the latter DDL statements will be relatively slow. The reason is that DDL operations in the current TiDB cluster are executed serially.
    集群管理 FAQ | PingCAP 文档中心
| username: Fly-bird | Original post link

Execute directly

| username: 大飞哥online | Original post link

Do it during off-peak business hours.

| username: Kongdom | Original post link

It is recommended to upgrade the version before making changes, as the new version changes particularly quickly.

| username: 大飞哥online | Original post link

That’s true.
However, upgrading the version just to change a character type seems a bit excessive, haha.

| username: zhanggame1 | Original post link

Just add it during a low business peak period.

| username: 大飞哥online | Original post link

Yes, during off-peak business hours, start running it, then increase the backfill parameters to make it run faster, and it will be fine.

| username: TiDBer_vfJBUcxl | Original post link

Choose a low-traffic period to handle it.

| username: tidb菜鸟一只 | Original post link

It is estimated that the time will be very long, so it is recommended to have an expectation. It is suggested to increase the tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size parameters during off-peak periods for backfill execution.
Additionally, the rollback might also take a long time.

| username: TiDBer_P49NMjIm | Original post link

Thank you for the explanation.

| username: Kongdom | Original post link

:thinking: It depends on personal choice. In our project, if it feels slow, we just upgrade to a new version, and it becomes much faster. The client is very satisfied. :yum:

| username: ti-tiger | Original post link

In TiDB, all supported DDL changes are executed online, so they shouldn’t be slow, right?