What is the unit of tidb_ddl_reorg_batch_size?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb_ddl_reorg_batch_size的单位是什么

| username: TiDBer_iLonNMYE

A basic question: what is the unit of tidb_ddl_reorg_batch_size? I couldn't find it in the manual.
Is it bytes, the number of keys, or the number of Regions?

| username: 啦啦啦啦啦

None of them. It is counted in batches.

| username: TiDBer_iLonNMYE

For example, if tidb_ddl_reorg_batch_size=256, what exactly are the 256 “units” of index data that get backfilled during reorg?

| username: 啦啦啦啦啦

It means 256 batches. tidb_ddl_reorg_batch_size is used together with tidb_ddl_reorg_worker_cnt.
The speed of backfilling data mainly depends on the product of tidb_ddl_reorg_worker_cnt and tidb_ddl_reorg_batch_size.
You can refer to this

| username: TiDBer_iLonNMYE

Got it. When backfilling data, each reorg worker processes batch_size batches of data, right? So is each batch measured in rows of keys, or in some other unit? That’s my question.

| username: h5n1

tidb_ddl_reorg_batch_size: Unit: Rows

| username: TiDBer_iLonNMYE

Got it, this is the answer I needed! Thanks a lot.
Can you provide a link? I didn’t see “Unit: Rows” anywhere.

| username: h5n1

The tidb_ddl_reorg_batch_size system variable specifies the batch size of data that is processed in each round of DDL reorganization. This variable can be used to control the speed of DDL operations. The default value is 256.
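
To make the unit concrete, here is a rough Python sketch of the arithmetic. It is not TiDB code. It assumes a simplified model in which each worker handles one batch of batch_size rows per round, and that tidb_ddl_reorg_worker_cnt is 4 (an assumption of this sketch). The function names are made up for illustration.

```python
def rows_per_round(batch_size: int = 256, worker_cnt: int = 4) -> int:
    """Rows handled in one round: one batch of `batch_size` rows per worker.

    batch_size stands for tidb_ddl_reorg_batch_size (unit: rows).
    worker_cnt stands for tidb_ddl_reorg_worker_cnt (4 is assumed here).
    """
    if batch_size <= 0 or worker_cnt <= 0:
        raise ValueError("batch_size and worker_cnt must be positive")
    return batch_size * worker_cnt


def backfill_rounds(total_rows: int, batch_size: int = 256, worker_cnt: int = 4) -> int:
    """Estimate the number of rounds needed to backfill `total_rows` rows.

    Simplified model only: real backfill speed also depends on the cluster,
    the workload, and how the table data is distributed.
    """
    if total_rows < 0:
        raise ValueError("total_rows must be non-negative")
    per_round = rows_per_round(batch_size, worker_cnt)
    # Integer ceiling division, so a partial last round still counts as one.
    return -(-total_rows // per_round)


if __name__ == "__main__":
    print(rows_per_round())            # 256 rows * 4 workers = 1024
    print(backfill_rounds(1_000_000))  # 977 rounds under this model
```

Under this model, raising either variable increases the number of rows processed per round, which is why the product of the two matters for backfill speed.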

| username: TiDBer_iLonNMYE

Thank you very much. I asked because I didn’t see it in the Chinese version of the manual. It would be best if the relevant team could keep the Chinese and English versions consistent.

| username: 啦啦啦啦啦

So the English version actually has the definitive answer. Learned something new :+1: