Querying a Single Table: Sorting by Publish Time Takes 200ms, Sorting by ID Takes Less Than 50ms. Why Is There Such a Big Difference? (Both Publish Time and ID Are Indexed)

translator_bot · June 22, 2024, 3:42am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 单表查询同一个表：按照发布时间排序，执行时间200ms；按照id排序执行时间小于50ms。为啥差不这么大？（发布时间和id都做了索引）

| username: xiaoxiaozuofang

[TiDB Usage Environment] Production Environment
[TiDB Version] TiDB v6.1.0
[Encountered Issue] Two queries against the same table differ only in the sort column: sorting by publish time takes 200ms, while sorting by id takes less than 50ms. Why is the difference so large? Both publish_time and id are indexed.
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: MrSylar

It looks like the optimizer is trying to eliminate the sort operation. The plan for the publish_time query therefore does an index full scan on the publish_time index (loops: 542), while the plan for the id query does a table full scan (loops: 10).

| username: zhanggame1

The execution plans differ: the query sorted by time goes through the index.

| username: caiyfc

First, run ANALYZE on this table. Stale statistics can lead the optimizer to a poor execution plan, and that makes the query slow.
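
A minimal sketch of that step, assuming the table is named `news` as mentioned later in the thread. `ANALYZE TABLE` and `SHOW STATS_HEALTHY` are TiDB statements; the Python helper `stats_refresh_statements` is only a hypothetical wrapper that prints them so they can be pasted into a SQL client.

```python
from typing import List


def stats_refresh_statements(table: str) -> List[str]:
    """Build the SQL to inspect and then refresh optimizer statistics."""
    return [
        # Health of the current statistics (0 to 100, higher is healthier).
        f"SHOW STATS_HEALTHY WHERE Table_name = '{table}';",
        # Recollect statistics so the optimizer works from current data.
        f"ANALYZE TABLE `{table}`;",
    ]


# "news" is the table name given in the thread.
for stmt in stats_refresh_statements("news"):
    print(stmt)
```

After the statistics are refreshed, rerunning both queries under `EXPLAIN ANALYZE` shows whether the chosen plan has changed.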

| username: Kongdom

Why does the first plan have two Selection operators? Please share the table’s index definitions.

| username: ealam_小羽

The index scan in the first plan also reads a lot of rows, about 550,000.
Is the problem consistently reproducible? Have other slow SQL queries affected this query before?

| username: Jellybean

The column used in the ORDER BY clause affects the choice of index. Try forcing both queries to use the publish_time or id index and compare the results to see the difference.
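
One way to run that comparison, sketched under assumptions: the original queries were only shown in screenshots, so the `SELECT` shape, the filter `status = 1`, and the index name `idx_publish_time` below are hypothetical stand-ins. `USE_INDEX` is a TiDB optimizer hint, and `EXPLAIN ANALYZE` executes the statement and reports actual times and row counts.

```python
def explain_forced(order_by: str, index: str, table: str = "news",
                   where: str = "status = 1", limit: int = 10) -> str:
    """Build an EXPLAIN ANALYZE statement that forces one index via USE_INDEX.

    The default filter and the index name passed in are hypothetical;
    substitute the real query and the name reported by SHOW INDEX.
    """
    return (
        f"EXPLAIN ANALYZE SELECT /*+ USE_INDEX({table}, {index}) */ * "
        f"FROM {table} WHERE {where} "
        f"ORDER BY {order_by} DESC LIMIT {limit};"
    )


# Same forced index, two sort columns: compare the reported times and rows.
for column in ("publish_time", "id"):
    print(explain_forced(column, "idx_publish_time"))
```

Holding the index fixed while changing only the sort column separates the cost of the access path from the cost of the sort.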

| username: redgame

The execution plans are different, and the differences are significant.

| username: tidb菜鸟一只

Sorting by time uses the index on the time column, and that is the problem. Sorting by id results in a full table scan, which is fine here. The index actually made the query slower. You can refresh the table’s statistics, or use a hint to tell the optimizer not to use the time index.
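
A small sketch of the hint approach. `IGNORE_INDEX` is a TiDB optimizer hint; the query text and the index name `idx_publish_time` are assumptions, since the real statement was not posted as text. The helper `add_hint` is hypothetical and simply places the hint comment after the first `SELECT` keyword, which is where TiDB expects it.

```python
import re


def add_hint(sql: str, hint: str) -> str:
    """Insert an optimizer hint comment right after the first SELECT keyword."""
    return re.sub(
        r"\b(select)\b",
        # A function replacement keeps the hint text literal (no escape handling).
        lambda m: f"{m.group(1)} /*+ {hint} */",
        sql,
        count=1,
        flags=re.IGNORECASE,
    )


# Hypothetical query and index name; replace with the real ones.
query = "SELECT * FROM news ORDER BY publish_time DESC LIMIT 10"
print(add_hint(query, "IGNORE_INDEX(news, idx_publish_time)"))
```

A hint pins the plan for this one statement only, so refreshing statistics remains worth trying first.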

| username: Sean007

The two SQL queries return different results, right? Judging by the execution plans, ordering by publish_time and ordering by id do not give the same row order. The different sort columns lead to different numbers of scanned rows, and therefore different execution times.

| username: xiaoxiaozuofang

The cluster has 5 TiKV nodes (Aliyun ECS, 8 cores and 32 GB each), and the news table holds only 4 million rows. A single-table query shouldn’t be this slow, right? The gap in query time between sorting by publish_time and sorting by id is too large. Does anyone know the reason?

| username: TiDB_C罗

The results should be consistent. The other filter conditions are unchanged; the queries just fetch the latest few rows, ordered by different columns. The 200ms query involves a table lookup after the index scan, while the 50ms query returns the required rows directly.
