Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: not in 执行计划疑问

TiDB version: v6.5.3
Hello teachers, I have a question about the execution plan for not in
?
mysql> show create table t3;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t3 | CREATE TABLE `t3` (
`id` int(11) NOT NULL,
`name` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
KEY `index_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show create table t4;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t4 | CREATE TABLE `t4` (
`id` int(11) NOT NULL,
`name` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
KEY `index_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> explain select * from t3 where t3.name not in (select name from t4 where name is not null) and t3.name is not null;
+-----------------------------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------+
| HashJoin_9 | 1.60 | root | | CARTESIAN anti semi join, other cond:eq(nulljoin.t3.name, nulljoin.t4.name) |
| ├─IndexReader_13(Build) | 1.00 | root | | index:IndexFullScan_12 |
| │ └─IndexFullScan_12 | 1.00 | cop[tikv] | table:t4, index:index_name(name) | keep order:false, stats:pseudo |
| └─IndexReader_11(Probe) | 2.00 | root | | index:IndexFullScan_10 |
| └─IndexFullScan_10 | 2.00 | cop[tikv] | table:t3, index:index_name(name) | keep order:false, stats:pseudo |
+-----------------------------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------+
5 rows in set (0.00 sec)
mysql> explain analyze select * from t3 where name not in (select name from t4 where name is not null) and name is not null;
+-----------------------------+---------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------+-----------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-----------------------------+---------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------+-----------+---------+
| HashJoin_9 | 1.60 | 1 | root | | time:975.1µs, loops:2, build_hash_table:{total:784.7µs, fetch:782µs, build:2.67µs}, probe:{concurrency:5, total:4.15ms, max:833.7µs, probe:13.1µs, fetch:4.14ms} | CARTESIAN anti semi join, other cond:eq(nulljoin.t3.name, nulljoin.t4.name) | 17.5 KB | 0 Bytes |
| ├─IndexReader_13(Build) | 1.00 | 1 | root | | time:767.3µs, loops:2, cop_task: {num: 1, max: 760.9µs, proc_keys: 1, rpc_num: 1, rpc_time: 749.9µs, copr_cache: disabled, distsql_concurrency: 15} | index:IndexFullScan_12 | 232 Bytes | N/A |
| │ └─IndexFullScan_12 | 1.00 | 1 | cop[tikv] | table:t4, index:index_name(name) | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_process_keys_size: 56, total_keys: 2, get_snapshot_time: 4.04µs, rocksdb: {key_skipped_count: 1, block: {cache_hit_count: 6}}} | keep order:false, stats:pseudo | N/A | N/A |
| └─IndexReader_11(Probe) | 2.00 | 2 | root | | time:815.5µs, loops:2, cop_task: {num: 1, max: 878.5µs, proc_keys: 2, rpc_num: 1, rpc_time: 859µs, copr_cache: disabled, distsql_concurrency: 15} | index:IndexFullScan_10 | 266 Bytes | N/A |
| └─IndexFullScan_10 | 2.00 | 2 | cop[tikv] | table:t3, index:index_name(name) | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 2, total_process_keys_size: 112, total_keys: 3, get_snapshot_time: 19.8µs, rocksdb: {key_skipped_count: 2, block: {cache_hit_count: 6}}} | keep order:false, stats:pseudo | N/A | N/A |
+-----------------------------+---------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------+-----------+---------+
5 rows in set (0.00 sec)
I have already filtered out fields that may contain null values with is not null
. Why does the execution plan still produce a Cartesian product? My personal thought is that after filtering out null values, it should be able to use a hash join operator since there are no null values affecting it anymore. Could the teachers please provide some guidance?