Performance Issues with Execution Plan Distinct

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 执行计划distinct性能问题

| username: 人如其名

[TiDB Usage Environment] PoC
[TiDB Version] 6.2
[Encountered Problem] DISTINCT on an indexed field performs an IndexFullScan
[Problem Phenomenon and Impact]
When running the query select distinct O_ORDERDATE from orders limit 10, why does it still perform an IndexFullScan even though there is an index on O_ORDERDATE?
distinct_problem.txt (5.1 KB)
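For reference, a sketch of how to reproduce and inspect the plan (TPC-H orders table assumed; the actual plan output is in the attached distinct_problem.txt):

-- EXPLAIN ANALYZE shows estRows and actRows for every operator, which makes
-- the IndexFullScan over the O_ORDERDATE index visible.
EXPLAIN ANALYZE SELECT DISTINCT O_ORDERDATE FROM orders LIMIT 10;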

| username: h5n1 | Original post link

This is normal. A distinct on a non-primary-key column has to read all the rows in order to aggregate them, so an IndexFullScan is chosen; otherwise it would be a TableFullScan. The aggregation has already been pushed down to TiKV. It is unclear whether the limit can be pushed down on top of the aggregated results; in theory it would be better if each cop task aggregated, applied the limit, and then returned its results. Also, there is a large gap between estRows and actRows; try collecting the statistics manually and run it again to see.
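A minimal sketch of the suggested statistics refresh, assuming the table is named orders (the database name below is an assumption):

-- Re-collect statistics so estRows moves closer to actRows, then re-check the plan.
ANALYZE TABLE orders;
SHOW STATS_HEALTHY WHERE Db_name = 'test' AND Table_name = 'orders';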

| username: 人如其名 | Original post link

With a traditional B+ tree, a distinct ... limit 10 served directly from the index takes at most a few milliseconds (assuming the index leaf pages are in memory), because only a small number of index pages need to be read to get the result. Here an LSM tree is used, and TiDB builds secondary indexes by concatenating the indexed column and the row ID into the key. So when I want the distinct keys, can't it do a jump search? Even if it scans the index keys sequentially within the range, it should be able to stop once the limit 10 results have been produced, right? It shouldn't be necessary to scan the entire index.
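As an illustration of the "jump search" idea, it can be emulated by hand today: each next distinct value is fetched with its own small index range read instead of one big scan (a sketch to be repeated client-side until 10 values are collected; this is not what the optimizer does on its own):

-- First distinct value: MIN() over the indexed column is a tiny index read.
SELECT MIN(O_ORDERDATE) AS dt FROM orders;
-- Next distinct value: seek past the previous result ('1995-01-01' is a placeholder
-- for the value returned by the previous step); repeat until 10 values are collected.
SELECT MIN(O_ORDERDATE) AS dt FROM orders WHERE O_ORDERDATE > '1995-01-01';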

| username: h5n1 | Original post link

It seems RocksDB has no such skipping scan; it scans sequentially until the condition no longer matches. I am not sure whether InnoDB handles it the way you describe. The limit has to run after the aggregation, so each cop task can only apply the limit after it has finished aggregating; it cannot apply limit 10 inside the cop task beforehand. I think the optimization would be to apply the limit to the aggregated data inside TiKV before returning. According to the issue below this appears to be supported, though there may be conditions or restrictions: Aggregation with sort and limit is not push down sort and limit parameter to TiKV · Issue #3938 · pingcap/tidb · GitHub.
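The query shape that issue discusses (aggregation with sort and limit) looks roughly like the following; whether a TopN actually ends up below the aggregation inside the cop task depends on version and conditions, so this is only a sketch to check with EXPLAIN:

-- If the pushdown applies, EXPLAIN should show a TopN/Limit inside the cop task
-- rather than only above the root-level aggregation.
EXPLAIN SELECT O_ORDERDATE FROM orders GROUP BY O_ORDERDATE ORDER BY O_ORDERDATE LIMIT 10;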

| username: 特雷西-迈克-格雷迪 | Original post link

Choosing IndexFullScan is normal, and the point that "it should stop once the limit 10 results have been produced" (and therefore return in milliseconds) is also right; it really shouldn't have to scan everything, so scanning it all is the abnormal part. Why must the index leaves be in memory? MySQL does have a kind of skip scan (with conditions); I'll post it in a moment.

| username: 特雷西-迈克-格雷迪 | Original post link

https://dev.mysql.com/doc/refman/8.0/en/group-by-optimization.html — the "Skip Scan" there is only meaningful for GROUP BY on more than two fields.
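For reference, the Loose Index Scan examples on that page are of this shape (assuming, as the manual does, an index over (c1, c2, c3) on table t1):

-- Queries that can use Loose Index Scan with an index on (c1, c2, c3):
SELECT DISTINCT c1, c2 FROM t1;
SELECT c1, MIN(c2) FROM t1 GROUP BY c1;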

| username: 人如其名 | Original post link

In a B+ tree I assumed the index entries are unique, so retrieval is fast; in TiDB, however, every secondary index key has the RID appended, which leads to a lot of scanning. (Correction: secondary indexes are not unique either.)

| username: 人如其名 | Original post link

I may have put it poorly; I meant to say assuming the index leaf pages are all in memory. The main point is that TiDB scans more index data than it should when doing a distinct on an indexed field, which I think should be avoided as much as possible.

| username: forever | Original post link

Secondary indexes are not deduplicated, if I remember correctly.

| username: 人如其名 | Original post link

Yes, secondary indexes are not deduplicated; I was mistaken. Still, the way a B+ tree handles this with a range scan does not require a full index scan.

| username: forever | Original post link

This is what h5n1 mentioned. Because the limit is not pushed down, all of the index data in TiKV is scanned before the limit is applied. If the limit could be pushed down, it should become a range scan.

| username: forever | Original post link

This isn't really related to skip scan. Skip scan applies when the query has no predicate on the leading column of an index and that leading column has only a few distinct values.
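For reference, the Skip Scan example in the MySQL manual is of this shape: an index on (f1, f2), no predicate on the leading column f1, and only a few distinct f1 values (a sketch of that documented case, not TiDB behavior):

-- No filter on the leading column f1; with few distinct f1 values,
-- the range f2 > 40 can be probed once per f1 value.
SELECT f1, f2 FROM t1 WHERE f2 > 40;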

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.