Creating an Index on an Empty Table

translator_bot · June 22, 2024, 11:55pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 空表建索引

| username: 胡杨树旁

I created a new table and wanted to add an index to it. Adding the index took about 3M. How does adding an index work internally? If adding an index requires traversing all the data in the table, then an empty table with no data shouldn't take 3M, right?

translator_bot · June 22, 2024, 11:55pm

| username: 胡杨树旁 | Original post link

Correction: creating the index on the empty table takes 3 seconds.

translator_bot · June 22, 2024, 11:55pm

| username: buddyyuan | Original post link

You can take a look at this

github.com/pingcap/tidb

ddl, tikv: add delay during AddIndex DDL and remove schema check for async commit

pingcap:master ← sticnarf:add-index-delay

opened 02:48AM - 21 Oct 20 UTC

sticnarf

+164 -75

### What problem does this PR solve?

Issue Number: #20531, but better not close it IMO because it's not a perfect solution.

### What is changed and how it works?

Using proposal 1 described in #20531, this PR adds a 2.5-second delay before running reorganization.

Before an async-commit transaction prewrites, we try to amend the transaction first. Then, we will have 2 seconds to prewrite the mutations (0.5 second is for clock drift). A `max_commit_ts` 2 seconds later is set on the prewrite request. If the calculated `commit_ts` exceeds it, the transaction should fail.

### Check List

Tests
- Integration test

Side effects
- Performance regression
- AddIndex DDL will wait at least 4 seconds.

### Release note

- Part of the async commit feature
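The timing described in the PR text can be sketched as simple arithmetic. This is only an illustration under stated assumptions: timestamps are modeled as plain seconds, and the names `max_commit_ts` and `commit_allowed` as Python functions are hypothetical, not TiDB or TiKV APIs. Only the 2.5 s, 0.5 s, and 2 s figures come from the PR description.

```python
# Figures quoted in the PR description; constant names are hypothetical.
REORG_DELAY_S = 2.5   # delay added before reorganization starts
CLOCK_DRIFT_S = 0.5   # part of the delay reserved for clock drift
PREWRITE_WINDOW_S = REORG_DELAY_S - CLOCK_DRIFT_S  # 2.0 s left to prewrite


def max_commit_ts(prewrite_start_s: float) -> float:
    """Latest commit time allowed for a prewrite starting at prewrite_start_s.

    Assumption: timestamps are plain seconds, which is a simplification.
    """
    return prewrite_start_s + PREWRITE_WINDOW_S


def commit_allowed(prewrite_start_s: float, commit_ts_s: float) -> bool:
    """Per the PR text, a commit_ts that exceeds max_commit_ts must fail."""
    return commit_ts_s <= max_commit_ts(prewrite_start_s)


if __name__ == "__main__":
    # A prewrite that starts at t=10 s may commit up to t=12 s.
    print(commit_allowed(10.0, 11.5))
    print(commit_allowed(10.0, 12.5))
```

Under this model the delay is a fixed cost of the DDL itself, so it applies even when the table holds no rows.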

translator_bot · June 22, 2024, 11:55pm

| username: 胡杨树旁 | Original post link

Okay, thank you.

translator_bot · June 22, 2024, 11:55pm

| username: alfred | Original post link

Creating an index on an empty table should be instantaneous. How were the system resources at that time?

translator_bot · June 22, 2024, 11:55pm

| username: 近墨者zyl | Original post link

Creating an index on an empty table still requires scanning the Region where the table is located.
Creating an index is a DDL operation. If the TiDB server that initiates the index creation is a worker rather than the owner, it first locates the owner TiDB server. The create index statement is parsed into a job and placed in the add index queue, where it is executed serially with all other index creation statements.
Creating an index also requires updating schema information and statistics. The schema information has to be written to TiKV, and PD has to notify TiDB to update its schema cache.
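To make the effect of serial execution concrete, here is a toy model of a first-in, first-out job queue. It is a sketch only and not how TiDB implements its DDL queue: the function name, the job names, and the durations are all hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


def serial_finish_times(jobs: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Run (name, duration_s) jobs one at a time in FIFO order.

    Returns each job's name with the time at which it finishes,
    measured from the moment the first job starts.
    """
    queue: Deque[Tuple[str, float]] = deque(jobs)
    clock = 0.0
    finished: List[Tuple[str, float]] = []
    while queue:
        name, duration = queue.popleft()
        clock += duration  # the next job cannot start until this one ends
        finished.append((name, clock))
    return finished


if __name__ == "__main__":
    # Hypothetical durations: a long-running job queued ahead of a short one.
    jobs = [("add index on big_table", 120.0), ("add index on empty_table", 3.0)]
    for name, done_at in serial_finish_times(jobs):
        print(f"{name}: finished at {done_at:.1f}s")
```

In this model the index on the empty table finishes at 123.0 s even though its own work takes 3 s, because it waits for the job ahead of it.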

translator_bot · June 22, 2024, 11:55pm

| username: 胡杨树旁 | Original post link

There are no resource bottlenecks, CPU load and IO are both below 10%, but creating an index is even slower than creating a table…

translator_bot · June 22, 2024, 11:55pm

| username: Min_Chen | Original post link

I reproduced it, and it does take 3 seconds. According to the logs, most of the time is spent at this log entry:

[2022/11/16 02:18:19.173 +00:00] [INFO] [ddl.go:1230] ["sleep before DDL finishes to make async commit and 1PC safe"] [duration=2.5s]
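If you want to pull this value out of tidb.log yourself, a small parser along the following lines should work. It is a minimal sketch: the function name `ddl_sleep_seconds` is hypothetical, and it assumes the duration is printed in seconds or milliseconds (for example `2.5s` or `300ms`), which is the only form shown in the entry above.

```python
import re
from typing import Optional

# Message text copied from the log entry quoted above.
SLEEP_MESSAGE = "sleep before DDL finishes to make async commit and 1PC safe"

# Assumption: the duration field looks like [duration=2.5s] or [duration=300ms].
_DURATION_FIELD = re.compile(r"\[duration=(\d+(?:\.\d+)?)(ms|s)\]")


def ddl_sleep_seconds(log_line: str) -> Optional[float]:
    """Return the sleep duration in seconds, or None if the line does not match."""
    if SLEEP_MESSAGE not in log_line:
        return None
    match = _DURATION_FIELD.search(log_line)
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value


if __name__ == "__main__":
    line = (
        '[2022/11/16 02:18:19.173 +00:00] [INFO] [ddl.go:1230] '
        '["sleep before DDL finishes to make async commit and 1PC safe"] '
        "[duration=2.5s]"
    )
    print(ddl_sleep_seconds(line))  # prints 2.5
```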

Because TiDB DDL is distributed and complex, it is indeed much slower than DDL in other databases. However, DDL is generally a one-time task, so the delay can mostly be ignored. You can also check whether the reply in the following post helps: "Why is creating an index on an empty table so slow in TiDB?" (TIDB为什么给一个空表创建索引会很慢呢) - #7, by centosredhat - TiDB Q&A community
There is also a blog post explaining how DDL works: "TiDB Source Code Reading Series (17): DDL Source Code Analysis" (TiDB 源码阅读系列文章（十七）DDL 源码解析) | PingCAP

translator_bot · June 22, 2024, 11:55pm

| username: 胡杨树旁 | Original post link

Could you please tell me which log this is referring to?

translator_bot · June 22, 2024, 11:55pm

| username: Min_Chen | Original post link

Hello, it's the tidb-server log, tidb.log.

translator_bot · June 22, 2024, 11:55pm

| username: 胡杨树旁 | Original post link

Okay, thank you.

translator_bot · June 22, 2024, 11:55pm

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.