The execution plan generation time for TPCH Q11 statement is too long

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TPCH的Q11语句执行计划生成时间过长

| username: 人如其名

[TiDB Usage Environment] Testing
[TiDB Version] 6.5.0

When testing the TPCH Q11 statement, it was found that the statement execution plan generation time was too long. The statement is as follows:

select /*+ TPCH_Q11 */
        ps_partkey,
        sum(ps_supplycost * ps_availqty) as value
from
        partsupp,
        supplier,
        nation
where
        ps_suppkey = s_suppkey
        and s_nationkey = n_nationkey
        and n_name = 'GERMANY'
group by
        ps_partkey having
                sum(ps_supplycost * ps_availqty) > (
                        select
                                sum(ps_supplycost * ps_availqty) * 0.0000100000
                        from
                                partsupp,
                                supplier,
                                nation
                        where
                                ps_suppkey = s_suppkey
                                and s_nationkey = n_nationkey
                                and n_name = 'GERMANY'
                )
order by
        value desc;

The specific execution plan is as follows:

mysql> explain select 
    ->         ps_partkey,
    ->         sum(ps_supplycost * ps_availqty) as value
    -> from
    ->         partsupp,
    ->         supplier,
    ->         nation
    -> where
    ->         ps_suppkey = s_suppkey
    ->         and s_nationkey = n_nationkey
    ->         and n_name = 'GERMANY'
    -> group by
    ->         ps_partkey having
    ->                 sum(ps_supplycost * ps_availqty) > (
    ->                         select
    ->                                 sum(ps_supplycost * ps_availqty) * 0.0000100000
    ->                         from
    ->                                 partsupp,
    ->                                 supplier,
    ->                                 nation
    ->                         where
    ->                                 ps_suppkey = s_suppkey
    ->                                 and s_nationkey = n_nationkey
    ->                                 and n_name = 'GERMANY'
    ->                 )
    -> order by
    ->         value desc;
+--------------------------------------------+------------+-----------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                         | estRows    | task      | access object  | operator info                                                                                                                                              |
+--------------------------------------------+------------+-----------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Projection_63                              | 257648.92  | root      |                | tpch1.partsupp.ps_partkey, Column#33                                                                                                                       |
| └─Sort_64                                  | 257648.92  | root      |                | Column#33:desc                                                                                                                                             |
|   └─Selection_66                           | 257648.92  | root      |                | gt(Column#33, 8102913.765246800000)                                                                                                                        |
|     └─HashAgg_67                           | 322061.15  | root      |                | group by:Column#58, funcs:sum(Column#56)->Column#33, funcs:firstrow(Column#57)->tpch1.partsupp.ps_partkey                                                  |
|       └─Projection_94                      | 322061.15  | root      |                | mul(tpch1.partsupp.ps_supplycost, cast(tpch1.partsupp.ps_availqty, decimal(20,0) BINARY))->Column#56, tpch1.partsupp.ps_partkey, tpch1.partsupp.ps_partkey |
|         └─HashJoin_71                      | 322061.15  | root      |                | inner join, equal:[eq(tpch1.supplier.s_suppkey, tpch1.partsupp.ps_suppkey)]                                                                                |
|           ├─HashJoin_84(Build)             | 4000.00    | root      |                | inner join, equal:[eq(tpch1.nation.n_nationkey, tpch1.supplier.s_nationkey)]                                                                               |
|           │ ├─TableReader_89(Build)        | 1.00       | root      |                | data:Selection_88                                                                                                                                          |
|           │ │ └─Selection_88               | 1.00       | cop[tikv] |                | eq(tpch1.nation.n_name, "GERMANY")                                                                                                                         |
|           │ │   └─TableFullScan_87         | 25.00      | cop[tikv] | table:nation   | keep order:false                                                                                                                                           |
|           │ └─TableReader_86(Probe)        | 100000.00  | root      |                | data:TableFullScan_85                                                                                                                                      |
|           │   └─TableFullScan_85           | 100000.00  | cop[tikv] | table:supplier | keep order:false                                                                                                                                           |
|           └─TableReader_91(Probe)          | 8000000.00 | root      |                | data:TableFullScan_90                                                                                                                                      |
|             └─TableFullScan_90             | 8000000.00 | cop[tikv] | table:partsupp | keep order:false                                                                                                                                           |
+--------------------------------------------+------------+-----------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+
14 rows in set (1.14 sec)

The overall execution time of the statement is not long (including the time to generate the execution plan, it is only 2.x seconds):

mysql> explain analyze select 
    ->         ps_partkey,
    ->         sum(ps_supplycost * ps_availqty) as value
    -> from
    ->         partsupp,
    ->         supplier,
    ->         nation
    -> where
    ->         ps_suppkey = s_suppkey
    ->         and s_nationkey = n_nationkey
    ->         and n_name = 'GERMANY'
    -> group by
    ->         ps_partkey having
    ->                 sum(ps_supplycost * ps_availqty) > (
    ->                         select
    ->                                 sum(ps_supplycost * ps_availqty) * 0.0000100000
    ->                         from
    ->                                 partsupp,
    ->                                 supplier,
    ->                                 nation
    ->                         where
    ->                                 ps_suppkey = s_suppkey
    ->                                 and s_nationkey = n_nationkey
    ->                                 and n_name = 'GERMANY'
    ->                 )
    -> order by
    ->         value desc;
+--------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| id                                         | estRows    | actRows | task      | access object  | execution info                                                                                                                                                                                                                                                                                     | operator info                                                                                                                                              | memory    | disk    |
+--------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| Projection_63                              | 257648.92  | 8685    | root      |                | time:1.27s, loops:10, Concurrency:3                                                                                                                                                                                                                                                                | tpch1.partsupp.ps_partkey, Column#33                                                                                                                       | 221.0 KB  | N/A     |
| └─Sort_64                                  | 257648.92  | 8685    | root      |                | time:1.27s, loops:10                                                                                                                                                                                                                                                                               | Column#33:desc                                                                                                                                             | 485.7 KB  | 0 Bytes |
|   └─Selection_66                           | 257648.92  | 8685    | root      |                | time:1.26s, loops:10                                                                                                                                                                                                                                                                               | gt(Column#33, 8102913.765246800000)                                                                                                                        | 48.5 KB   | N/A     |
|     └─HashAgg_67                           | 322061.15  | 304774  | root      |                | time:1.25s, loops:300                                                                                                                                                                                                                                                                              | group by:Column#58, funcs:sum(Column#56)->Column#33, funcs:firstrow(Column#57)->tpch1.partsupp.ps_partkey                                                  | 66.5 MB   | 0 Bytes |
|       └─Projection_94                      | 322061.15  | 323920  | root      |                | time:877.6ms, loops:319, Concurrency:3                                                                                                                                                                                                                                                             | mul(tpch1.partsupp.ps_supplycost, cast(tpch1.partsupp.ps_availqty, decimal(20,0) BINARY))->Column#56, tpch1.partsupp.ps_partkey, tpch1.partsupp.ps_partkey | 355.5 KB  | N/A     |
|         └─HashJoin_71                      | 322061.15  | 323920  | root      |                | time:1.17s, loops:319, build_hash_table:{total:7.12ms, fetch:6.22ms, build:901.5µs}, probe:{concurrency:3, total:3.62s, max:1.21s, probe:610.4ms, fetch:3.01s}                                                                                                                                     | inner join, equal:[eq(tpch1.supplier.s_suppkey, tpch1.partsupp.ps_suppkey)]                                                                                | 308.5 KB  | 0 Bytes |
|           ├─HashJoin_84(Build)             | 4000.00    | 4049    | root      |                | time:6.57ms, loops:7, build_hash_table:{total:596.2µs, fetch:590.9µs, build:5.28µs}, probe:{concurrency:3, total:19.6ms, max:6.53ms, probe:5.04ms, fetch:14.5ms}                                                                                                                                   | inner join, equal:[eq(tpch1.nation.n_nationkey, tpch1.supplier.s_nationkey)]                                                                               | 25.7 KB   | 0 Bytes |
|           │ ├─TableReader_89(Build)        | 1.00       | 1       | root      |                | time:577.5µs, loops:2, cop_task: {num: 1, max: 1.01ms, proc_keys: 25, rpc_num: 1, rpc_time: 977.4µs, copr_cache_hit_ratio: 0.00, distsql_concurrency: 5}                                                                                                                                           | data:Selection_88                                                                                                                                          | 317 Bytes | N/A     |
|           │ │ └─Selection_88               | 1.00       | 1       | cop[tikv] |                | tikv_task:{time:1ms, loops:1}, scan_detail: {total_process_keys: 25, total_process_keys_size: 3109, total_keys: 26, get_snapshot_time: 30.1µs, rocksdb: {key_skipped_count: 25, block: {cache_hit_count: 8}}}                                                                                      | eq(tpch1.nation.n_name, "GERMANY")                                                                                                                         | N/A       | N/A     |
|           │ │   └─TableFullScan_87         | 25.00      | 25      | cop[tikv] | table:nation   | tikv_task:{time:0s, loops:1}                                                                                                                                                                                                                                                                       | keep order:false                                                                                                                                           | N/A       | N/A     |
|           │ └─TableReader_86(Probe)        | 100000.00  | 100000  | root      |                | time:4.23ms, loops:100, cop_task: {num: 10, max: 1.02ms, min: 209.3µs, avg: 506.7µs, p95: 1.02ms, max_proc_keys: 480, p95_proc_keys: 480, rpc_num: 10, rpc_time: 4.94ms, copr_cache_hit_ratio: 0.90, distsql_concurrency: 5}                                                                       | data:TableFullScan_85                                                                                                                                      | 964.4 KB  | N/A     |
|           │   └─TableFullScan_85           | 100000.00  | 100000  | cop[tikv] | table:supplier | tikv_task:{proc max:637ms, min:1ms, avg: 160.4ms, p80:491ms, p95:637ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 480, total_process_keys_size: 87745, total_keys: 481, get_snapshot_time: 102.5µs, rocksdb: {key_skipped_count: 480, block: {cache_hit_count: 10}}}                 | keep order:false                                                                                                                                           | N/A       | N/A     |
|           └─TableReader_91(Probe)          | 8000000.00 | 8000000 | root      |                | time:947.7ms, loops:7840, cop_task: {num: 292, max: 47.8ms, min: 171.1µs, avg: 15.6ms, p95: 37.7ms, max_proc_keys: 50144, p95_proc_keys: 50144, tot_proc: 3.41s, tot_wait: 336ms, rpc_num: 292, rpc_time: 4.55s, copr_cache_hit_ratio: 0.12, distsql_concurrency: 5}                               | data:TableFullScan_90                                                                                                                                      | 11.3 MB   | N/A     |
|             └─TableFullScan_90             | 8000000.00 | 8000000 | cop[tikv] | table:partsupp | tikv_task:{proc max:43ms, min:0s, avg: 11.5ms, p80:22ms, p95:27ms, iters:8969, tasks:292}, scan_detail: {total_process_keys: 7206272, total_process_keys_size: 1336557763, total_keys: 7206528, get_snapshot_time: 5.21ms, rocksdb: {key_skipped_count: 7206272, block: {cache_hit_count: 23171}}} | keep order:false                                                                                                                                           | N/A       | N/A     |
+--------------------------------------------+------------+---------+-----------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
14 rows in set (2.35 sec)

The trace plan information is as follows:
trace_plan.json (61.3 KB)

Why does it take more than 1 second to generate the execution plan?

| username: Lucien-卢西恩 | Original post link

@人如其名 Can you provide the corresponding JSON analysis screenshot? I couldn’t parse it successfully locally. Additionally, it would be best to provide the table schema information. I can try to reproduce it in my test environment. If the issue is with generating the execution plan slowly and it’s an Optimizer problem, I can reproduce it.

| username: 人如其名 | Original post link

Hello, you can directly generate TPC-H to reproduce (specify sf=1, it will be completed quickly):

tiup bench tpch -D tpch -H 192.168.31.201 -P 5432 -U root -p root --sf=1 prepare -T 5 --dropdata

Then execute the above explain information.

You can also use the following table structures:

mysql> show create table partsupp;
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table    | Create Table                                                                                                                                                                                                                                                                                                                                                    |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| partsupp | CREATE TABLE `partsupp` (
  `PS_PARTKEY` bigint(20) NOT NULL,
  `PS_SUPPKEY` bigint(20) NOT NULL,
  `PS_AVAILQTY` bigint(20) NOT NULL,
  `PS_SUPPLYCOST` decimal(15,2) NOT NULL,
  `PS_COMMENT` varchar(199) NOT NULL,
  PRIMARY KEY (`PS_PARTKEY`,`PS_SUPPKEY`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> show create table supplier;
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table    | Create Table                                                                                                                                                                                                                                                                                                                                                                                             |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| supplier | CREATE TABLE `supplier` (
  `S_SUPPKEY` bigint(20) NOT NULL,
  `S_NAME` char(25) NOT NULL,
  `S_ADDRESS` varchar(40) NOT NULL,
  `S_NATIONKEY` bigint(20) NOT NULL,
  `S_PHONE` char(15) NOT NULL,
  `S_ACCTBAL` decimal(15,2) NOT NULL,
  `S_COMMENT` varchar(101) NOT NULL,
  PRIMARY KEY (`S_SUPPKEY`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> show create table nation;
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                          |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| nation | CREATE TABLE `nation` (
  `N_NATIONKEY` bigint(20) NOT NULL,
  `N_NAME` char(25) NOT NULL,
  `N_REGIONKEY` bigint(20) NOT NULL,
  `N_COMMENT` varchar(152) DEFAULT NULL,
  PRIMARY KEY (`N_NATIONKEY`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
| username: 人如其名 | Original post link

I tested it myself, and it couldn’t be reproduced with an empty table. It should be related to the statistics. A complete reproduction is still needed: tiup bench tpch -D tpch -H 192.168.31.201 -P 5432 -U root -p root --sf=10 prepare -T 5 --dropdata --analyze

| username: 人如其名 | Original post link

It seems to be related to the subquery in the HAVING clause. Here are two tests:

  1. Removing the WHERE condition in the subquery to form a Cartesian product query causes the execution plan to run indefinitely without producing results:
explain
select 
        ps_partkey,
        sum(ps_supplycost * ps_availqty) as value
from
        partsupp,
        supplier,
        nation
where
        ps_suppkey = s_suppkey
        and s_nationkey = n_nationkey
        and n_name = 'GERMANY'
group by
        ps_partkey having
                sum(ps_supplycost * ps_availqty) > (
                        select
                                sum(ps_supplycost * ps_availqty) * 0.0000100000
                        from
                                partsupp,
                                supplier,
                                nation
--                        where
--                                ps_suppkey = s_suppkey
--                                and s_nationkey = n_nationkey
--                                and n_name = 'GERMANY'
                )
order by
        value desc;


  1. Removing the join conditions in the subquery and keeping only partsupp produces results instantly:
explain
select 
        ps_partkey,
        sum(ps_supplycost * ps_availqty) as value
from
        partsupp,
        supplier,
        nation
where
        ps_suppkey = s_suppkey
        and s_nationkey = n_nationkey
        and n_name = 'GERMANY'
group by
        ps_partkey having
                sum(ps_supplycost * ps_availqty) > (
                        select
                                sum(ps_supplycost * ps_availqty) * 0.0000100000
                        from
                                partsupp
                )
order by
        value desc;

| username: tiancaiamao | Original post link

The reason is that this is a non-correlated subquery. TiDB executes this subquery during the compilation phase and obtains the result. Then, it uses the obtained result to continue constructing the plan and then executes it…

If you observe the plan carefully, you will find,

|   └─Selection_66                           | 257648.92  | root      |                | gt(Column#33, 8102913.765246800000)

The value 8102913.765246800000 inside this is the result calculated in advance by the subquery. The condition sum(ps_supplycost * ps_availqty) > (select subquery) is transformed into gt(Column#33, 8102913.765246800000).

So, the reason you see the plan generation time being too long is that the subquery SQL is executed during the plan generation process, and the time consumed is counted in the plan generation. This issue has been raised many times and has not been resolved, currently feeling a bit difficult to overcome…

| username: 人如其名 | Original post link

I see, thank you, expert!

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.