Execution Plan Issue - Low Efficiency of Index Lookup Join

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 执行计划问题-Index Lookup Join效率低下

| username: 人如其名

To improve efficiency, please provide the following information. Clear problem descriptions can be resolved faster:
【TiDB Usage Environment】Research
【TiDB Version】6.2
【Encountered Problem】Low efficiency when performing Index Lookup Join on primary key
【Problem Phenomenon and Impact】

Execution Plan Issue.txt (13.1 KB)

When simulating Index Lookup Join to observe execution efficiency, I found that the efficiency is particularly low when reading inner table data (intentionally turned off the TiDB cache, and repeated executions took about 2 seconds each time).
As shown in Execution Plan 1 in the attachment, the inner table is associated using the primary key C_CUSTKEY. In the operator TableReader_56 (Probe), the maximum cop_task task takes 363.3ms, max_proc_keys: 636 (through total_process_keys: 6196, total_keys: 6200, it can be inferred that max_keys will not exceed 640), which means scanning 636 keys, and all are memory reads (block: {cache_hit_count: 24023}, it can be confirmed that no physical reads occurred). Why does it take so much time?
For comparison simulation, directly query the primary key C_CUSTKEY of the customer table, simulate full memory read, scan more than 640 records (set to 1000 here), and repeatedly test to find the longest time taken. As shown in Execution Plan 2 in the attachment, it can be seen that scanning total_process_keys: 999 in tikv takes almost no time.
So, I would like to ask, in the case of a small number of records associated with the primary key index, why is it still so slow?

Supplementary trace execution situation:
trace execution time.txt (25.0 KB)

Even the query filtered by index on the outer table is slow: select count(O_CUSTKEY) from orders b where b.O_ORDERDATE =‘1996-12-14’ The actual result is only more than 6k, but it takes more than 1 second.
Specific data can be found in the attachment: orders step-by-step single table slow data.txt (16.0 KB)

Looking at the execution plan, the phenomenon of inefficient SQL is: cache_hit_count is much greater than total_keys. That is, I don’t know why there are so many cache_hit_count. Does this cache_hit_count refer to the number of times rocksdb’s block is read?

| username: 特雷西-迈克-格雷迪 | Original post link

What does rpc time represent, and why is it so high?

| username: 人如其名 | Original post link

The response from TiKV is slow, and the parallel processing combined with waiting results in high latency here, but this is not the main focus this time. The main question is why TiKV scanning is slow.

| username: 人如其名 | Original post link

The key point here is the cache_hit_count value is too high for the query select count(O_CUSTKEY) from orders b where b.O_ORDERDATE ='1996-12-14';. If the primary key of the table is of varchar type, you can reduce the block cache_hit_count by using _tidb_rowid for sorting and back table scanning. The SQL can be rewritten as follows: select count(O_CUSTKEY) from orders where _tidb_rowid in (select _tidb_rowid from orders b where b.O_ORDERDATE ='1996-12-14' order by _tidb_rowid). This ensures that _tidb_rowid is ordered, reducing the number of block accesses to the orders table. However, in this case, the primary key of the orders table is o_custkey, which is a bigint clustered index, so this optimization cannot be applied. For the o_orderdate field index, there should be a rowid for o_custkey, so theoretically, there should be no need to back table scan.

| username: xiaohetao | Original post link

If everything else checks out fine, then you can only optimize the SQL. After optimization, the execution performance has greatly improved as well. :+1:

| username: 人如其名 | Original post link

Actually, I think this can be evaluated by the optimizer, reducing the need for us to do such optimizations ourselves.

| username: xiaohetao | Original post link

The optimizer can only choose the optimal execution plan for the existing SQL, but it cannot change the SQL.

| username: 人如其名 | Original post link

Actually, in this case, it is possible to control without changing the SQL, just by altering the behavior of the rid back table scan block through the access mechanism. This kind of mechanism is generally present in traditional single databases.

| username: xiaohetao | Original post link

Yes, traditional databases also perform table lookups for orders.