Consultation on a TiDB No-Index Query Performance Issue

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 一个 tidb 无索引查询性能问题请教

| username: Qiuchi

[TiDB Usage Environment] Test
[TiDB Version] 6.5

With the same table structure and data volume, Oracle is significantly faster than TiDB for this query. The former takes about two minutes, while the latter never returns. How can this be optimized?

SQL

cr_transaction about 70 million
cr_cust_calc_load_balance about 6 million
cr_customer_account about 6 million

select t.ta_system_code,
       t.ta_account_no,
       t.cust_name,
       t.ta_cfm_serial_no,
       t.transaction_cfm_date,
       t.product_code,
       t.transaction_type
from cr_transaction t
where t.ta_return_code = '0000'
  and t.ta_account_no
    not in (select ta_account_no from cr_cust_calc_load_balance)
  and t.ta_cfm_serial_no not in (
    select t.ta_cfm_serial_no
    from cr_transaction t
             left join cr_customer_account m on t.ta_account_no = m.ta_account_no
    where t.ta_system_code = 'ETFTA'
      and m.ta_account_no is null
);

Oracle Execution Plan

TiDB Execution Plan

id estRows task access object operator info
Projection_14 47682489.35 root dcdb.cr_transaction.ta_system_code, dcdb.cr_transaction.ta_account_no, dcdb.cr_transaction.cust_name, dcdb.cr_transaction.ta_cfm_serial_no, dcdb.cr_transaction.transaction_cfm_date, dcdb.cr_transaction.product_code, dcdb.cr_transaction.transaction_type
└─HashJoin_16 47682489.35 root anti semi join, equal:[eq(dcdb.cr_transaction.ta_cfm_serial_no, dcdb.cr_transaction.ta_cfm_serial_no)]
├─Selection_25(Build) 421507.82 root isnull(dcdb.cr_customer_account.ta_account_no)
│ └─IndexHashJoin_31 526884.78 root left outer join, inner:IndexReader_28, outer key:dcdb.cr_transaction.ta_account_no, inner key:dcdb.cr_customer_account.ta_account_no, equal cond:eq(dcdb.cr_transaction.ta_account_no, dcdb.cr_customer_account.ta_account_no)
│ ├─IndexLookUp_49(Build) 509898.74 root
│ │ ├─IndexRangeScan_47(Build) 509898.74 cop[tikv] table:t, index:IDX_CR_TRANS_CODE_DATE_NO(ta_system_code, transaction_cfm_date, ta_cfm_serial_no) range:[“ETFTA”,“ETFTA”], keep order:false
│ │ └─TableRowIDScan_48(Probe) 509898.74 cop[tikv] table:t keep order:false
│ └─IndexReader_28(Probe) 526884.78 root index:IndexRangeScan_27
│ └─IndexRangeScan_27 526884.78 cop[tikv] table:m, index:PRIMARY(ta_account_no, ta_system_code) range: decided by [eq(dcdb.cr_customer_account.ta_account_no, dcdb.cr_transaction.ta_account_no)], keep order:false
└─HashJoin_17(Probe) 59603111.69 root CARTESIAN anti semi join, other cond:eq(dcdb.cr_transaction.ta_account_no, dcdb.cr_cust_calc_load_balance.ta_account_no)
├─IndexReader_24(Build) 6078924.00 root index:IndexFullScan_23
│ └─IndexFullScan_23 6078924.00 cop[tikv] table:cr_cust_calc_load_balance, index:IDX_LOAD_BALANCE_KEY(calc_unit_id, ta_system_code, ta_account_no) keep order:false
└─TableReader_20(Probe) 74503889.61 root data:Selection_19
└─Selection_19 74503889.61 cop[tikv] eq(dcdb.cr_transaction.ta_return_code, “0000”)
└─TableFullScan_18 74622837.00 cop[tikv] table:t keep order:false
| username: tidb狂热爱好者 | Original post link

Your performance is poor because you have too many ‘not in’ and Cartesian joins. I have fixed the Cartesian joins.

| username: tidb狂热爱好者 | Original post link

Try this:

select distinct t.ta_system_code,
       t.ta_account_no,
       t.cust_name,
       t.ta_cfm_serial_no,
       t.transaction_cfm_date,
       t.product_code,
       t.transaction_type
from cr_transaction t
left join cr_cust_calc_load_balance c on t.ta_account_no = c.ta_account_no
left join cr_transaction t2 on t.ta_cfm_serial_no = t2.ta_cfm_serial_no
left join cr_customer_account m on t.ta_account_no = m.ta_account_no
where t.ta_return_code = '0000'
  and c.ta_account_no is null
  and t2.ta_cfm_serial_no is null
  and t.ta_system_code = 'ETFTA'
  and m.ta_account_no is not null;
| username: tidb狂热爱好者 | Original post link

SELECT t.ta_system_code,
       t.ta_account_no,
       t.cust_name,
       t.ta_cfm_serial_no,
       t.transaction_cfm_date,
       t.product_code,
       t.transaction_type
FROM cr_transaction t
WHERE t.ta_return_code = '0000'
  AND NOT EXISTS (
    SELECT 1
    FROM cr_cust_calc_load_balance l
    WHERE l.ta_account_no = t.ta_account_no
  )
  AND NOT EXISTS (
    SELECT 1
    FROM cr_transaction t2
    INNER JOIN cr_customer_account m ON t2.ta_account_no = m.ta_account_no
    WHERE t2.ta_system_code = 'ETFTA'
      AND m.ta_account_no IS NULL
      AND t2.ta_cfm_serial_no = t.ta_cfm_serial_no
  )
| username: Kongdom | Original post link

Changing it to “not exists” might be better.

Isn’t the logic of the inner join in the second “not exists” incorrect? The original text seems to want to find records that are in t2 but not in m, while an inner join should find records that are in both t2 and m, right?

| username: liuis | Original post link

If you don’t know how to optimize, just optimize the query statement, haha.

| username: Qiuchi | Original post link

Yes, it indeed cannot use inner join, but I tried using exists before, and the execution plan was the same as before. What I’m actually curious about is that the execution plans for this query in TiDB and Oracle seem to be the same? But why is there such a big difference in speed?

| username: Kongdom | Original post link

My understanding is that TiDB requires at least one more aggregation step, where the query results from each node are returned to the owner node, which is not necessary for a single-node database. The stronger the computational power of each node, the stronger the distributed database.

| username: tidb菜鸟一只 | Original post link

I see that the indexes chosen by the two execution plans are different, which also has a significant impact.

| username: 人如其名 | Original post link

I can’t understand why there is a CARTESIAN anti semi join here. Generally speaking, a Cartesian product only occurs when there are no join conditions or when the join conditions are non-equivalent. There are also corresponding examples in the source code:

But I couldn’t reproduce it myself:

It’s version 6.5.

| username: Qiuchi | Original post link

I indeed don’t have any associated conditions, just like the documentation. It’s strange why your example isn’t like this. I don’t understand the difference between “other cond:” and “cond:” in this context.

| username: 人如其名 | Original post link

I tested it, and it seems to be related to whether the join condition in the subquery can be null.

The reason it becomes CARTESIAN is probably related to this PR: plan/executor: make semi joins null and empty aware by eurekaka · Pull Request #9051 · pingcap/tidb · GitHub

The CARTESIAN Cartesian product mark in the operation info here is probably related to the definition of cr_cust_calc_load_balance.ta_account_no allowing null. However, it is still unclear whether this affects performance. You can capture the execution information of the connection next time you execute it: explain for connection conn_id.

| username: 人如其名 | Original post link

After testing, the issue seems to be related to the Cartesian product here, and the root cause is likely related to the join fields in the anti-semi join being nullable:

mysql> show create table customer_bak;
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table        | Create Table                                                                                                                                                                                                                                                                                                                                                                            |
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| customer_bak | CREATE TABLE `customer_bak` (
  `C_CUSTKEY` bigint(20) DEFAULT NULL,
  `C_NAME` varchar(25) NOT NULL,
  `C_ADDRESS` varchar(40) NOT NULL,
  `C_NATIONKEY` bigint(20) NOT NULL,
  `C_PHONE` char(15) NOT NULL,
  `C_ACCTBAL` decimal(15,2) NOT NULL,
  `C_MKTSEGMENT` char(10) NOT NULL,
  `C_COMMENT` varchar(117) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)

mysql> show create table customer_bak1;
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table         | Create Table                                                                                                                                                                                                                                                                                                                                                                         |
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| customer_bak1 | CREATE TABLE `customer_bak1` (
  `C_CUSTKEY` bigint(20) NOT NULL,
  `C_NAME` varchar(25) NOT NULL,
  `C_ADDRESS` varchar(40) NOT NULL,
  `C_NATIONKEY` bigint(20) NOT NULL,
  `C_PHONE` char(15) NOT NULL,
  `C_ACCTBAL` decimal(15,2) NOT NULL,
  `C_MKTSEGMENT` char(10) NOT NULL,
  `C_COMMENT` varchar(117) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)

mysql> 

mysql> explain analyze select * from orders where o_custkey not in (select c_custkey from customer_bak1);
+-----------------------------+-------------+----------+-----------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+----------+---------+
| id                          | estRows     | actRows  | task      | access object       | execution info                                                                                                                                                                                                                                                                                                                                                        | operator info                                                                   | memory   | disk    |
+-----------------------------+-------------+----------+-----------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+----------+---------+
| HashJoin_8                  | 29862297.60 | 36607771 | root      |                     | time:15.5s, loops:35753, build_hash_table:{total:124.3ms, fetch:110.8ms, build:13.5ms}, probe:{concurrency:5, total:1m17.8s, max:15.6s, probe:13.5s, fetch:1m4.3s}                                                                                                                                                                                                    | anti semi join, equal:[eq(tpch.orders.o_custkey, tpch.customer_bak1.c_custkey)] | 6.80 MB  | 0 Bytes |
| ├─TableReader_12(Build)     | 100000.00   | 100000   | root      |                     | time:111.3ms, loops:100, cop_task: {num: 10, max: 17.2ms, min: 9.71ms, avg: 12.1ms, p95: 17.2ms, max_proc_keys: 33760, p95_proc_keys: 33760, tot_proc: 109ms, rpc_num: 10, rpc_time: 120.4ms, copr_cache: disabled, build_task_duration: 4.63µs, max_distsql_concurrency: 1}                                                                                          | data:TableFullScan_11                                                           | 482.5 KB | N/A     |
| │ └─TableFullScan_11        | 100000.00   | 100000   | cop[tikv] | table:customer_bak1 | tikv_task:{proc max:17ms, min:9ms, avg: 11.3ms, p80:15ms, p95:17ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 100000, total_process_keys_size: 21056033, total_keys: 100010, get_snapshot_time: 93.1µs, rocksdb: {delete_skipped_count: 843488, key_skipped_count: 943488, block: {cache_hit_count: 6704}}}                                             | keep order:false, stats:pseudo                                                  | N/A      | N/A     |
| └─TableReader_10(Probe)     | 37327872.00 | 37438464 | root      |                     | time:12s, loops:36662, cop_task: {num: 1259, max: 539.4ms, min: 825.2µs, avg: 170.5ms, p95: 398.9ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 2m41.2s, tot_wait: 36.3s, rpc_num: 1259, rpc_time: 3m34.6s, copr_cache: disabled, build_task_duration: 57µs, max_distsql_concurrency: 15}                                                                  | data:TableFullScan_9                                                            | 25.4 MB  | N/A     |
|   └─TableFullScan_9         | 37327872.00 | 37438464 | cop[tikv] | table:orders        | tikv_task:{proc max:490ms, min:0s, avg: 118.3ms, p80:211ms, p95:300ms, iters:41559, tasks:1259}, scan_detail: {total_process_keys: 37438464, total_process_keys_size: 5686354813, total_keys: 37439723, get_snapshot_time: 28.3ms, rocksdb: {key_skipped_count: 37438464, block: {cache_hit_count: 64349, read_count: 136302, read_byte: 1.79 GB, read_time: 6.48s}}} | keep order:false                                                                | N/A      | N/A     |
+-----------------------------+-------------+----------+-----------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+----------+---------+
5 rows in set (15.56 sec)


mysql> explain for connection 6313024131759604271;
+-----------------------------+-------------+---------+-----------+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+--------+------+
| id                          | estRows     | actRows | task      | access object      | execution info                                                                                                                                                                                                                                                                                                                                   | operator info                                                                               | memory | disk |
+-----------------------------+-------------+---------+-----------+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+--------+------+
| HashJoin_8                  | 29862297.60 | 229376  | root      |                    | time:11m53.2s, loops:224                                                                                                                                                                                                                                                                                                                         | CARTESIAN anti semi join, other cond:eq(tpch.orders.o_custkey, tpch.customer_bak.c_custkey) | N/A    | N/A  |
| ├─TableReader_12(Build)     | 100000.00   | 100000  | root      |                    | time:60.9ms, loops:100                                                                                                                                                                                                                                                                                                                           | data:TableFullScan_11                                                                       | N/A    | N/A  |
| │ └─TableFullScan_11        | 100000.00   | 100000  | cop[tikv] | table:customer_bak | tikv_task:{proc max:14ms, min:0s, avg: 5.1ms, p80:12ms, p95:14ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 100000, total_process_keys_size: 21053008, total_keys: 100010, get_snapshot_time: 1ms, rocksdb: {key_skipped_count: 100000, block: {cache_hit_count: 161}}}                                                            | keep order:false                                                                            | N/A    | N/A  |
| └─TableReader_10(Probe)     | 37327872.00 | 239840  | root      |                    | time:19.1ms, loops:237                                                                                                                                                                                                                                                                                                                           | data:TableFullScan_9                                                                        | N/A    | N/A  |
|   └─TableFullScan_9         | 37327872.00 | 241888  | cop[tikv] | table:orders       | tikv_task:{proc max:14ms, min:0s, avg: 2.56ms, p80:4ms, p95:8ms, iters:601, tasks:97}, scan_detail: {total_process_keys: 241888, total_process_keys_size: 36738824, total_keys: 241985, get_snapshot_time: 34.6ms, rocksdb: {key_skipped_count: 241888, block: {cache_hit_count: 1779, read_count: 313, read_byte: 2.85 MB, read_time: 2.05ms}}} | keep order:false                                                                            | N/A    | N/A  |
+-----------------------------+-------------+---------+-----------+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+--------+------+
5 rows in set (0.02 sec)


From the above, we can see that for the customer_bak1 table (non-null join condition), the result is returned in 15 seconds. However, for the customer_bak table (nullable join condition), it has been running for 11 minutes and still hasn’t finished, likely with no end in sight.

Therefore, it seems that TiDB has performance issues with non-null fields in anti-semi joins. To further analyze the specific performance issues, we would need to delve into the key paths in the pprof and examine the code.

| username: Qiuchi | Original post link

It’s a bit strange. I saw that this PR is supposed to distinguish between null and non-null operands for the IN operation, but the outer table of this hash join is cr_transaction, and the ta_account_no column in this table allows nulls, while the inner table load_balance does not allow nulls. However, since you were able to reproduce it, it should indeed be related. I’ll take another look at it tomorrow after running it. Thanks.

| username: tiancaiamao | Original post link

Whether or not there is a null value can greatly affect (anti) semi joins, because the eq condition of a join is normally true or false. However, with the presence of null, the eq judgment can have three states: true, false, or null. When null is involved, special handling is required, leading to a final query plan that uses a Cartesian product + filter, resulting in poorer performance.

In TiDB 6.3, there were some optimizations made in this area, implementing null-aware anti join: Support Null-aware Anti Join · Issue #37525 · pingcap/tidb · GitHub

Regarding the topic of null-aware anti join, Oracle seems to have a patent on this: US7676450B2 - Null aware anti-join - Google Patents

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.