Does "not in" not support index join?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: not in 不支持走index join嘛

| username: Raymond

[Test Environment for TiDB]
Version: v6.5.3
Dear teachers, is it true that the “not in” anti join statement does not support index join?

mysql> show create table t3;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                     |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t3    | CREATE TABLE `t3` (
  `id` int(11) NOT NULL,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `index_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show create table t4;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                                     |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| t4    | CREATE TABLE `t4` (
  `id` int(11) NOT NULL,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `index_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> explain select /*+ INL_JOIN(t3) */  * from t3 where t3.name  not  in (select t4.name from t4);
+-----------------------------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------+
| id                          | estRows | task      | access object                    | operator info                                                               |
+-----------------------------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------+
| HashJoin_8                  | 1.60    | root      |                                  | CARTESIAN anti semi join, other cond:eq(nulljoin.t3.name, nulljoin.t4.name) |
| ├─IndexReader_16(Build)     | 1.00    | root      |                                  | index:IndexFullScan_15                                                      |
| │ └─IndexFullScan_15        | 1.00    | cop[tikv] | table:t4, index:index_name(name) | keep order:false, stats:pseudo                                              |
| └─TableReader_10(Probe)     | 2.00    | root      |                                  | data:TableFullScan_9                                                        |
|   └─TableFullScan_9         | 2.00    | cop[tikv] | table:t3                         | keep order:false, stats:pseudo                                              |
+-----------------------------+---------+-----------+----------------------------------+-----------------------------------------------------------------------------+
5 rows in set, 1 warning (0.00 sec)

mysql> explain select /*+ INL_JOIN(t3) */  * from t3 where t3.name    in (select t4.name from t4);
+--------------------------------+---------+-----------+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+
| id                             | estRows | task      | access object                    | operator info                                                                                                                               |
+--------------------------------+---------+-----------+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_13                   | 1.25    | root      |                                  | inner join, inner:IndexReader_12, outer key:nulljoin.t4.name, inner key:nulljoin.t3.name, equal cond:eq(nulljoin.t4.name, nulljoin.t3.name) |
| ├─StreamAgg_32(Build)          | 1.00    | root      |                                  | group by:nulljoin.t4.name, funcs:firstrow(nulljoin.t4.name)->nulljoin.t4.name                                                               |
| │ └─IndexReader_33             | 1.00    | root      |                                  | index:StreamAgg_24                                                                                                                          |
| │   └─StreamAgg_24             | 1.00    | cop[tikv] |                                  | group by:nulljoin.t4.name,                                                                                                                  |
| │     └─IndexFullScan_31       | 1.00    | cop[tikv] | table:t4, index:index_name(name) | keep order:true, stats:pseudo                                                                                                               |
| └─IndexReader_12(Probe)        | 1.25    | root      |                                  | index:Selection_11                                                                                                                          |
|   └─Selection_11               | 1.25    | cop[tikv] |                                  | not(isnull(nulljoin.t3.name))                                                                                                               |
|     └─IndexRangeScan_10        | 1.25    | cop[tikv] | table:t3, index:index_name(name) | range: decided by [eq(nulljoin.t3.name, nulljoin.t4.name)], keep order:false, stats:pseudo                                                  |
+--------------------------------+---------+-----------+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.00 sec)

| username: TiDBer_JUi6UvZm | Original post link

It seems that there is no problem with the query plan.

| username: YuchongXU | Original post link

Join should be considered a connection algorithm.

| username: paulli | Original post link

Added an agg, it seems that index join is not supported.

| username: TiDBer_iLonNMYE | Original post link

Rewrite it as a table join and try it. From T3 left join T4 on T3.NAME = T4.NAME and T3.NAME is null

| username: changpeng75 | Original post link

You can consider using the EXCEPT set operation to rewrite it, as TiDB does not support MINUS.

| username: xiaoqiao | Original post link

Generally, “not in” doesn’t work, try using “not exists” / “left join” instead.

| username: xfworld | Original post link

Try using left join or right join.

| username: 人如其名 | Original post link

The anti-semi join only supports hash join, and it is speculated that the left outer anti-semi join supports index join. You can refer to the left outer anti-semi join in this article to test the hint for using the index: https://zhuanlan.zhihu.com/p/671700446

| username: zhaokede | Original post link

Not supported, switch to something else.

| username: 小龙虾爱大龙虾 | Original post link

This is a post from the TiDB community. Why not just share the community link? :joy_cat:

| username: TiDBer_RjzUpGDL | Original post link

Try using a left join.

| username: 人如其名 | Original post link

Yes, it’s Mr. Mao’s article, but I saw it on Zhihu before, so I searched for it directly on Zhihu.

| username: 健康的腰间盘 | Original post link

Subqueries are not as good as joins.

| username: 小于同学 | Original post link

Try using a left join.

| username: zhh_912 | Original post link

TiDB does not support minus.