Analysis of Execution Calculation Bug in Semi Join

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: semi join 的执行计算解析bug ?

| username: Hacker_r2TAbGbZ

Bug Report
A relatively simple statement:

SELECT order_id FROM fin_dws.dws_fin_east_multi_year_acv_lcy
WHERE data_source = 'EAST'

Directly reports an error, Unknown column ‘order_id’ in ‘field list’

SELECT account_id,
                acv_gsr_year AS gs_year,
                contract_no,
                lcy_total_acv AS acv2
         FROM fin_dws.dws_fin2_global_acv_1d t
         WHERE order_id NOT IN (
             SELECT order_id FROM fin_dws.dws_fin_east_multi_year_acv_lcy
             WHERE data_source = 'EAST'
         )
           AND acv_gsr_year IN (2023, 2024)
           AND t.type = 'EAST'
           LIMIT 10

Execution plan is as follows:

id,estRows,actRows,task,access object,execution info,operator info,memory,disk
Projection_11,10.00,0,root,,"time:5.31s, loops:1, RU:11.888240, Concurrency:OFF","fin_dws.dws_fin2_global_acv_1d.account_id, fin_dws.dws_fin2_global_acv_1d.acv_gsr_year, fin_dws.dws_fin2_global_acv_1d.contract_no, fin_dws.dws_fin2_global_acv_1d.lcy_total_acv",9.14 KB,N/A
└─Limit_14,10.00,0,root,,"time:5.31s, loops:1","offset:0, count:10",N/A,N/A
  └─HashJoin_15,10.00,0,root,,"time:5.31s, loops:1, build_hash_table:{total:11ms, fetch:9.77ms, build:1.23ms}, probe:{concurrency:16, total:1m9.7s, max:5.31s, probe:1m8.8s, fetch:899.3ms}",CARTESIAN anti semi join,438.4 KB,0 Bytes
    ├─TableReader_21(Build),12426.00,12426,root,,"time:10.1ms, loops:14, cop_task: {num: 7, max: 2.66ms, min: 1.23ms, avg: 1.55ms, p95: 2.66ms, tot_proc: 11µs, tot_wait: 1.73ms, rpc_num: 7, rpc_time: 10.7ms, copr_cache_hit_ratio: 1.00, build_task_duration: 6µs, max_distsql_concurrency: 1}",data:Selection_20,95.9 KB,N/A
    │ └─Selection_20,12426.00,12426,cop[tikv],,"tikv_task:{proc max:20ms, min:0s, avg: 7.43ms, p80:12ms, p95:20ms, iters:39, tasks:7}, scan_detail: {get_snapshot_time: 1.38ms, rocksdb: {block: {}}}","eq(fin_dws.dws_fin_east_multi_year_acv_lcy.data_source, ""EAST"")",N/A,N/A
    │   └─TableFullScan_19,12426.00,12426,cop[tikv],table:dws_fin_east_multi_year_acv_lcy,"tikv_task:{proc max:20ms, min:0s, avg: 7.43ms, p80:12ms, p95:20ms, iters:39, tasks:7}",keep order:false,N/A,N/A
    └─TableReader_18(Probe),12.50,22239,root,,"time:62.8ms, loops:23, cop_task: {num: 18, max: 40.9ms, min: 502.5µs, avg: 8.66ms, p95: 40.9ms, tot_proc: 28.7µs, tot_wait: 4.89ms, rpc_num: 18, rpc_time: 155.3ms, copr_cache_hit_ratio: 1.00, build_task_duration: 17.5µs, max_distsql_concurrency: 3}",data:Selection_17,830.3 KB,N/A
      └─Selection_17,12.50,22239,cop[tikv],,"tikv_task:{proc max:120ms, min:0s, avg: 35.6ms, p80:68ms, p95:120ms, iters:141, tasks:18}, scan_detail: {get_snapshot_time: 4.05ms, rocksdb: {block: {}}}","eq(fin_dws.dws_fin2_global_acv_1d.type, ""EAST""), or(eq(cast(fin_dws.dws_fin2_global_acv_1d.acv_gsr_year, double BINARY), 2023), eq(cast(fin_dws.dws_fin2_global_acv_1d.acv_gsr_year, double BINARY), 2024))",N/A,N/A
        └─TableFullScan_16,73.55,70424,cop[tikv],table:t,"tikv_task:{proc max:112ms, min:0s, avg: 34.2ms, p80:64ms, p95:112ms, iters:141, tasks:18}",keep order:false,N/A,N/A

So the CARTESIAN anti semi join here has a problem, it is not recognized.

Change to another way of writing:

SELECT account_id,
                acv_gsr_year AS gs_year,
                contract_no,
                lcy_total_acv AS acv2
         FROM fin_dws.dws_fin2_global_acv_1d t
         LEFT JOIN (
             SELECT order_id FROM fin_dws.dws_fin_east_multi_year_acv_lcy
             WHERE data_source = 'EAST'
         ) multi
            ON t.order_id = multi.order_id
         WHERE multi.order_id IS NULL
           AND acv_gsr_year IN (2023, 2024)
           AND t.type = 'EAST'
           LIMIT 10

This way there is no error.

Table DDL files:
dws_fin_east_multi_year_acv_lcy.ddl.sql (2.7 KB)
dws_fin2_global_acv_1d.ddl.sql (7.9 KB)

【TiDB Version】
7.5.1
【Impact of the Bug】
Affects the accuracy of statistical results.
【Possible Steps to Reproduce the Problem】
Simple SQL can reproduce:

-- NOT IN subquery
WITH t1 AS (
    SELECT 1 AS n
),

t2 AS (
  SELECT 2 AS b 
  )
  
SELECT *
FROM t1
WHERE n NOT IN (SELECT n FROM t2);

And IN subquery

WITH t1 AS (
    SELECT 1 AS n
),

t2 AS (
  SELECT 2 AS b 
  )
  
SELECT *
FROM t1
WHERE n IN (SELECT n FROM t2)

【Observed Unexpected Behavior】
The statement can be executed smoothly without error, producing unexpected results.
【Expected Behavior】
Syntax error
【Related Components and Specific Versions】
tidb
【Other Background Information or Screenshots】
Such as cluster topology, system and kernel version, application app information, etc.; if the problem is related to SQL, please provide SQL statements and related table Schema information; if there are key errors in the node logs, please provide relevant node log content or files; if some business-sensitive information is inconvenient to provide, please leave contact information, and we will communicate with you privately.

| username: WalterWj | Original post link

What is the table structure?

| username: Hacker_r2TAbGbZ | Original post link

Thanks~~ Okay, I’ve updated it in the post @WalterWj

| username: WalterWj | Original post link

Your table does not have an order_id column.

| username: WalterWj | Original post link

I want to know what the problem is :thinking:. Is it that you think the execution plan has an issue, or is the result incorrect, or is there an SQL error?

| username: forever | Original post link

It should be stated that executing it alone will result in an error for each column, but after incorporating it into the subquery, the error no longer occurs.

| username: WalterWj | Original post link

It should report an error, but this might be caused by Deferred Name Resolution. The database might not immediately check for the existence of columns in the subquery, especially if the subquery is not actually executed (due to optimization or other reasons).

| username: WalterWj | Original post link

I tested it in MySQL and there were no errors.

| username: tidb菜鸟一只 | Original post link

This has nothing to do with TiDB or MySQL; it’s because your sql_mode is set to NO_ENGINE_SUBSTITUTION in non-strict mode.
SHOW GLOBAL VARIABLES LIKE ‘sql_mode’;
Remove NO_ENGINE_SUBSTITUTION and reset sql_mode, and it will report an error.
SET sql_mode=‘ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER’;

| username: onlyacat | Original post link

I don’t quite understand what the issue is… I tried the latest version of TiDB and the SQL at the bottom is consistent with MySQL.

| username: Hacker_r2TAbGbZ | Original post link

Tested this method, and TiDB did not report an error …


SET sql_mode='ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO';
SELECT account_id,
                acv_gsr_year AS gs_year,
                contract_no,
                lcy_total_acv AS acv2
         FROM fin_dws.dws_fin2_global_acv_1d t
         WHERE order_id NOT IN (
             SELECT order_id FROM  fin_dws.dws_fin_east_multi_year_acv_lcy
             WHERE data_source = 'EAST'
         )
           AND acv_gsr_year IN (2023, 2024)
           AND t.type = 'EAST'
           LIMIT 10
| username: tidb菜鸟一只 | Original post link

Sorry, the previous reply was incorrect. The order_id in the subquery was treated as the order_id field of the outer table, so no error was reported. You can change it like this to report an error:

SELECT account_id,
       acv_gsr_year AS gs_year,
       contract_no,
       lcy_total_acv AS acv2
FROM fin_dws.dws_fin2_global_acv_1d t
WHERE order_id NOT IN (
    SELECT a.order_id FROM fin_dws.dws_fin_east_multi_year_acv_lcy a
    WHERE data_source = 'EAST'
)
  AND acv_gsr_year IN (2023, 2024)
  AND t.type = 'EAST'
  LIMIT 10
| username: Hacker_r2TAbGbZ | Original post link

Hmm. It’s like this, this is very strange :sweat_smile: It actually parsed to the outer layer.

| username: Hacker_r2TAbGbZ | Original post link

Thank you, handsome. It’s my lack of understanding of SQL syntax parsing. If a field cannot be found, it will look for it in the outer layer. Writing this kind of code requires caution…

| username: Hacker_r2TAbGbZ | Original post link

Thank you very much.

| username: zhaokede | Original post link

It is best to add aliases for multiple tables, and use alias.field for field names.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.