After testing, the issue seems to be related to the Cartesian product here, and the root cause is likely related to the join fields in the anti-semi join being nullable:
mysql> show create table customer_bak;
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| customer_bak | CREATE TABLE `customer_bak` (
`C_CUSTKEY` bigint(20) DEFAULT NULL,
`C_NAME` varchar(25) NOT NULL,
`C_ADDRESS` varchar(40) NOT NULL,
`C_NATIONKEY` bigint(20) NOT NULL,
`C_PHONE` char(15) NOT NULL,
`C_ACCTBAL` decimal(15,2) NOT NULL,
`C_MKTSEGMENT` char(10) NOT NULL,
`C_COMMENT` varchar(117) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
mysql> show create table customer_bak1;
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| customer_bak1 | CREATE TABLE `customer_bak1` (
`C_CUSTKEY` bigint(20) NOT NULL,
`C_NAME` varchar(25) NOT NULL,
`C_ADDRESS` varchar(40) NOT NULL,
`C_NATIONKEY` bigint(20) NOT NULL,
`C_PHONE` char(15) NOT NULL,
`C_ACCTBAL` decimal(15,2) NOT NULL,
`C_MKTSEGMENT` char(10) NOT NULL,
`C_COMMENT` varchar(117) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
mysql>
mysql> explain analyze select * from orders where o_custkey not in (select c_custkey from customer_bak1);
+-----------------------------+-------------+----------+-----------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+----------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-----------------------------+-------------+----------+-----------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+----------+---------+
| HashJoin_8 | 29862297.60 | 36607771 | root | | time:15.5s, loops:35753, build_hash_table:{total:124.3ms, fetch:110.8ms, build:13.5ms}, probe:{concurrency:5, total:1m17.8s, max:15.6s, probe:13.5s, fetch:1m4.3s} | anti semi join, equal:[eq(tpch.orders.o_custkey, tpch.customer_bak1.c_custkey)] | 6.80 MB | 0 Bytes |
| ├─TableReader_12(Build) | 100000.00 | 100000 | root | | time:111.3ms, loops:100, cop_task: {num: 10, max: 17.2ms, min: 9.71ms, avg: 12.1ms, p95: 17.2ms, max_proc_keys: 33760, p95_proc_keys: 33760, tot_proc: 109ms, rpc_num: 10, rpc_time: 120.4ms, copr_cache: disabled, build_task_duration: 4.63µs, max_distsql_concurrency: 1} | data:TableFullScan_11 | 482.5 KB | N/A |
| │ └─TableFullScan_11 | 100000.00 | 100000 | cop[tikv] | table:customer_bak1 | tikv_task:{proc max:17ms, min:9ms, avg: 11.3ms, p80:15ms, p95:17ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 100000, total_process_keys_size: 21056033, total_keys: 100010, get_snapshot_time: 93.1µs, rocksdb: {delete_skipped_count: 843488, key_skipped_count: 943488, block: {cache_hit_count: 6704}}} | keep order:false, stats:pseudo | N/A | N/A |
| └─TableReader_10(Probe) | 37327872.00 | 37438464 | root | | time:12s, loops:36662, cop_task: {num: 1259, max: 539.4ms, min: 825.2µs, avg: 170.5ms, p95: 398.9ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 2m41.2s, tot_wait: 36.3s, rpc_num: 1259, rpc_time: 3m34.6s, copr_cache: disabled, build_task_duration: 57µs, max_distsql_concurrency: 15} | data:TableFullScan_9 | 25.4 MB | N/A |
| └─TableFullScan_9 | 37327872.00 | 37438464 | cop[tikv] | table:orders | tikv_task:{proc max:490ms, min:0s, avg: 118.3ms, p80:211ms, p95:300ms, iters:41559, tasks:1259}, scan_detail: {total_process_keys: 37438464, total_process_keys_size: 5686354813, total_keys: 37439723, get_snapshot_time: 28.3ms, rocksdb: {key_skipped_count: 37438464, block: {cache_hit_count: 64349, read_count: 136302, read_byte: 1.79 GB, read_time: 6.48s}}} | keep order:false | N/A | N/A |
+-----------------------------+-------------+----------+-----------+---------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+----------+---------+
5 rows in set (15.56 sec)
mysql> explain for connection 6313024131759604271;
+-----------------------------+-------------+---------+-----------+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+--------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-----------------------------+-------------+---------+-----------+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+--------+------+
| HashJoin_8 | 29862297.60 | 229376 | root | | time:11m53.2s, loops:224 | CARTESIAN anti semi join, other cond:eq(tpch.orders.o_custkey, tpch.customer_bak.c_custkey) | N/A | N/A |
| ├─TableReader_12(Build) | 100000.00 | 100000 | root | | time:60.9ms, loops:100 | data:TableFullScan_11 | N/A | N/A |
| │ └─TableFullScan_11 | 100000.00 | 100000 | cop[tikv] | table:customer_bak | tikv_task:{proc max:14ms, min:0s, avg: 5.1ms, p80:12ms, p95:14ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 100000, total_process_keys_size: 21053008, total_keys: 100010, get_snapshot_time: 1ms, rocksdb: {key_skipped_count: 100000, block: {cache_hit_count: 161}}} | keep order:false | N/A | N/A |
| └─TableReader_10(Probe) | 37327872.00 | 239840 | root | | time:19.1ms, loops:237 | data:TableFullScan_9 | N/A | N/A |
| └─TableFullScan_9 | 37327872.00 | 241888 | cop[tikv] | table:orders | tikv_task:{proc max:14ms, min:0s, avg: 2.56ms, p80:4ms, p95:8ms, iters:601, tasks:97}, scan_detail: {total_process_keys: 241888, total_process_keys_size: 36738824, total_keys: 241985, get_snapshot_time: 34.6ms, rocksdb: {key_skipped_count: 241888, block: {cache_hit_count: 1779, read_count: 313, read_byte: 2.85 MB, read_time: 2.05ms}}} | keep order:false | N/A | N/A |
+-----------------------------+-------------+---------+-----------+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------------------+--------+------+
5 rows in set (0.02 sec)
From the above, we can see that for the customer_bak1
table (non-null join condition), the result is returned in 15 seconds. However, for the customer_bak
table (nullable join condition), it has been running for 11 minutes and still hasn’t finished, likely with no end in sight.
Therefore, it seems that TiDB has performance issues with non-null fields in anti-semi joins. To further analyze the specific performance issues, we would need to delve into the key paths in the pprof and examine the code.