Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 单表查询同一个表:按照发布时间排序,执行时间200ms;按照id排序执行时间小于50ms。为啥差不这么大?(发布时间和id都做了索引)
[TiDB Usage Environment] Production Environment
[TiDB Version] tidb v6.1.0
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Querying the same table: sorting by publication time takes 200ms; sorting by id takes less than 50ms. Why is there such a big difference? (Both publication time and id have indexes)]
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]
It seems that the optimizer is trying to eliminate the sorting operation, so the execution plan above uses the index on the publish_time column for an index full scan, loop 542, while the execution plan below uses a table full scan, loop 10.
The execution plans are different; it uses the index when sorted by time.
First, analyze this table. An incorrect execution plan can cause it to be slow.
Why did the first one have two selections? Please share the table’s index.
The first scanned index also seems to be quite large, 550,000.
Has it been consistently reproducible? Have you encountered other slow SQL queries affecting the current query before?
The column used in the ORDER BY clause affects the choice of index. Try forcing both queries to use the publish_time or id index and compare the results to see the difference.
The execution plans are different, and the differences are significant.
When sorting by time, the time field index was used, which is problematic. Sorting by ID resulted in a full table scan, which is fine. The index actually reduced the query time. You can collect the table’s statistics or use a hint to specify not to use the time index.
The results of the two SQL queries are different, right? From the execution plan, the order of publish_time and id should be inconsistent. The two different sorting conditions lead to different numbers of scanned rows, resulting in different execution times.
With 5 KV nodes (8 cores * 32G Aliyun ECS), the data volume of the news table is only 4 million. Single table queries shouldn’t be this slow, right? The difference in query time when filtering by publish_time and id is too significant. Does anyone know the reason?
It should be consistent. Other filtering conditions remain unchanged, just fetch the latest few rows of data based on different fields. The 200ms one involves a table lookup, while the 50ms one directly returns the required rows of data.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.