Creating Index on an Empty Table

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 空表建索引

| username: 胡杨树旁

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Issue Phenomenon and Impact]
[Resource Configuration]
[Attachments: Screenshots / Logs / Monitoring]
A new table was created, and I wanted to add an index. I found that adding the index requires about 3M. I would like to ask what the principle of adding an index is? If adding an index requires traversing the entire table data, then an empty table with no data shouldn’t require 3M, right?

| username: 胡杨树旁 | Original post link

The correct time for creating an index on an empty table is 3 seconds.

| username: buddyyuan | Original post link

You can take a look at this

| username: 胡杨树旁 | Original post link

Okay, thank you.

| username: alfred | Original post link

Creating an index on an empty table should be instantaneous. How were the system resources at that time?

| username: 近墨者zyl | Original post link

  1. Creating an index on an empty table still requires scanning the region where the table is located.
  2. An index is a DDL statement. If the TiDB server initiating the index creation is a worker, it will first find the owner TiDB server, parse the create index statement into a job, and place it in the add index queue to be executed serially with all other index creation statements.
  3. Creating an index requires updating schema information and statistics. The schema information needs to be written to TiKV and also requires PD to notify TiDB to update the schema cache.
| username: 胡杨树旁 | Original post link

There are no resource bottlenecks, CPU load and IO are both below 10%, but creating an index is even slower than creating a table…

| username: Min_Chen | Original post link

I reproduced it, and it indeed takes 3 seconds. Checking the logs, the main time consumption is on this log entry.

[2022/11/16 02:18:19.173 +00:00] [INFO] [ddl.go:1230] ["sleep before DDL finishes to make async commit and 1PC safe"] [duration=2.5s]

Due to the particularity and complexity of TiDB DDL, it is indeed much slower compared to other databases. However, DDL is generally a one-time task and can be mostly ignored. Additionally, you can refer to the reply in the following post to see if it helps: TIDB为什么 给一个空表创建索引会很慢呢 - #7,来自 centosredhat - TiDB 的问答社区
Also, there is a blog explaining the DDL principles: TiDB 源码阅读系列文章(十七)DDL 源码解析 | PingCAP

| username: 胡杨树旁 | Original post link

Could you please tell me which log this is referring to?

| username: Min_Chen | Original post link

Hello, it’s the tidb-server log tidb.log.

| username: 胡杨树旁 | Original post link

Okay, thank you.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.