Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: HashJoin占用内存过大,落盘不及时

[TiDB Usage Environment] Poc
[TiDB Version] 6.5.0
[Reproduction Path] Components involved: tidb, tikv. When executing the TPCH Q5 statement with default parameters, it was found that the statement-level OOM always occurred under the wrong execution plan.
The Q5 statement is originally intended to perform grouping and aggregation on the result set, etc., for simplification. The test statement and execution plan are as follows (since the tables are relatively small, no statistics were collected for the nation and region tables, while other tables were collected, leading to the wrong execution plan):
select
count(*)
from
customer,
orders,
lineitem,
supplier,
nation,
region
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and l_suppkey = s_suppkey
and c_nationkey = s_nationkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'ASIA'
and o_orderdate >= date '1994-01-01'
and o_orderdate < date '1994-01-01' + interval '1' year;
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_24 | 1.00 | root | | funcs:count(1)->Column#48 |
| └─IndexJoin_30 | 8738986.41 | root | | inner join, inner:TableReader_27, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey, equal cond:eq(tpch.orders.o_orderkey, tpch.lineitem.l_orderkey), eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey) |
| ├─HashJoin_37(Build) | 2196527.02 | root | | inner join, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)] |
| │ ├─HashJoin_39(Build) | 1431131.00 | root | | inner join, equal:[eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)] |
| │ │ ├─HashJoin_41(Build) | 25.00 | root | | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)] |
| │ │ │ ├─HashJoin_54(Build) | 0.01 | root | | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)] |
| │ │ │ │ ├─TableReader_59(Build) | 0.01 | root | | data:Selection_58 |
| │ │ │ │ │ └─Selection_58 | 0.01 | cop[tikv] | | eq(tpch.region.r_name, "ASIA") |
| │ │ │ │ │ └─TableFullScan_57 | 5.00 | cop[tikv] | table:region | keep order:false, stats:pseudo |
| │ │ │ │ └─TableReader_56(Probe) | 25.00 | root | | data:TableFullScan_55 |
| │ │ │ │ └─TableFullScan_55 | 25.00 | cop[tikv] | table:nation | keep order:false, stats:pseudo |
| │ │ │ └─TableReader_61(Probe) | 100000.00 | root | | data:TableFullScan_60 |
| │ │ │ └─TableFullScan_60 | 100000.00 | cop[tikv] | table:supplier | keep order:false |
| │ │ └─TableReader_63(Probe) | 1500000.00 | root | | data:TableFullScan_62 |
| │ │ └─TableFullScan_62 | 1500000.00 | cop[tikv] | table:customer | keep order:false |
| │ └─TableReader_66(Probe) | 2196527.02 | root | | data:Selection_65 |
| │ └─Selection_65 | 2196527.02 | cop[tikv] | | ge(tpch.orders.o_orderdate, 1994-01-01), lt(tpch.orders.o_orderdate, 1995-01-01) |
| │ └─TableFullScan_64 | 14457280.00 | cop[tikv] | table:orders | keep order:false |
| └─TableReader_27(Probe) | 2196527.02 | root | | data:TableRangeScan_26 |
| └─TableRangeScan_26 | 2196527.02 | cop[tikv] | table:lineitem | range: decided by [eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)], keep order:false |
+------------------------------------------------+-------------+-----------+----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
20 rows in set (0.00 sec)
When the parameters are set to default, the memory-related parameters are:
mysql> show variables like 'tidb_mem_quota_query';
+----------------------+------------+
| Variable_name | Value |
+----------------------+------------+
| tidb_mem_quota_query | 1073741824 |
+----------------------+------------+
1 row in set (0.01 sec)
mysql> show variables like '%oom%';
+--------------------------------+--------+
| Variable_name | Value |
+--------------------------------+--------+
| tidb_enable_tmp_storage_on_oom | ON |
| tidb_mem_oom_action | CANCEL |
+--------------------------------+--------+
2 rows in set (0.00 sec)
mysql> show config where name like 'tmp-storage%';
+------+---------------------+-------------------+-----------------------------------------------------------------+
| Type | Instance | Name | Value |
+------+---------------------+-------------------+-----------------------------------------------------------------+
| tidb | 192.168.31.201:4002 | tmp-storage-path | /tmp/1000_tidb/MC4wLjAuMDo0MDAyLzAuMC4wLjA6MTAwODI=/tmp-storage |
| tidb | 192.168.31.201:4002 | tmp-storage-quota | -1 |
+------+---------------------+-------------------+-----------------------------------------------------------------+
2 rows in set (0.01 sec)
Executing the statement:
mysql> explain analyze select count(*) from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'ASIA' and o_orderdate >= date '1994-01-01'
and o_orderdate < date '1994-01-01' + interval '1' year;
ERROR 1105 (HY000): context canceled
mysql> explain analyze select count(*) from customer, orders, lineitem, supplier, nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
and l_suppkey = s_suppkey and c_nationkey = s_nationkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'ASIA' and o_orderdate >= date '1994-01-01'
and o_orderdate < date '1994-01-01' + interval '1' year;
ERROR 2013 (HY000): Lost connection to MySQL server during query
No connection. Trying to reconnect...
Connection id: 415
Current database: tpch
ERROR 1105 (HY000): Out Of Memory Quota![conn_id=293158387267404191]
As you can see, OOM occurred. Here, I have already set up disk spilling, but it was still canceled!
Modify the parameter set global tidb_mem_oom_action=LOG; to only log when memory exceeds, without killing the statement. However, according to the OOM priority, it will still try to spill to disk as much as possible. Here, observe how much memory the statement actually uses during execution. The explain analyze execution result is as follows:
+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------------------------+-------------+------------+-----------+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+-----------+
| HashAgg_24 | 1.00 | 1 | root | | time:40m16.7s, loops:2, partial_worker:{wall_time:40m16.708802251s, concurrency:5, task_num:72, tot_wait:3h21m23.538992743s, tot_exec:2.902767ms, tot_time:3h21m23.543541588s, max:40m16.708722824s, p95:40m16.708722824s}, final_worker:{wall_time:40m16.709129743s, concurrency:5, task_num:5, tot_wait:3h21m23.543673912s, tot_exec:342.368µs, tot_time:3h21m23.544019075s, max:40m16.709041758s, p95:40m16.709041758s} | funcs:count(1)->Column#48 | 165.3 KB | N/A |
| └─IndexJoin_30 | 8738986.41 | 72985 | root | | time:40m16.7s, loops:73, inner:{total:12m43.5s, concurrency:5, task:71508, construct:10m57.2s, fetch:1m44.8s, build:1.33s}, probe:2m15s | inner join, inner:TableReader_27, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey, equal cond:eq(tpch.orders.o_orderkey, tpch.lineitem.l_orderkey), eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey) | 10.2 MB | N/A |
| ├─HashJoin_37(Build) | 2196527.02 | 1830400706 | root | | time:39m51.8s, loops:1787505, build_hash_table:{total:15m1s, fetch:11m33.8s, build:3m27.3s}, probe:{concurrency:5, total:3h21m17.6s, max:40m16.7s, probe:2h6m12.3s, fetch:1h15m5.3s} | inner join, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)] | 17.9 GB | 44.8 GB |
| │ ├─HashJoin_39(Build) | 1431131.00 | 1203307921 | root | | time:12m31.8s, loops:1175109, build_hash_table:{total:15.5ms, fetch:13.4ms, build:2.11ms}, probe:{concurrency:5, total:1h15m1.6s, max:15m1s, probe:1h15m1.3s, fetch:245ms} | inner join, equal:[eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)] | 846.4 KB | 782.7 KB |
| │ │ ├─HashJoin_41(Build) | 25.00 | 20037 | root | | time:14.5ms, loops:23, build_hash_table:{total:2.36ms, fetch:2.35ms, build:10.1µs}, probe:{concurrency:5, total:75.2ms, max:15.2ms, probe:15ms, fetch:60.2ms} | inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)] | 1.59 KB | 120 Bytes |
| │ │ │ ├─HashJoin_54(Build) | 0.01 | 5 | root | | time:2.34ms, loops:2, build_hash_table:{total:2.14ms, fetch:2.13ms, build:10.1µs}, probe:{concurrency:5, total:10.7ms, max:2.22ms, probe:66.6µs, fetch:10.6ms} | inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)] | 31.1 KB | 36 Bytes |
| │ │ │ │ ├─TableReader_59(Build) | 0.01 | 1 | root | | time:2.11ms, loops:2, cop_task: {num: 1, max: 2.6ms, proc_keys: 5, rpc_num: 1, rpc_time: 2.57ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:Selection_58 | 312 Bytes | N/A |
| │ │ │ │ │ └─Selection_58 | 0.01 | 1 | cop[tikv] | | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 5, total_process_keys_size: 559, total_keys: 6, get_snapshot_time: 3.1µs, rocksdb: {key_skipped_count: 5, block: {cache_hit_count: 4}}} | eq(tpch.region.r_name, "ASIA") | N/A | N/A |
| │ │ │ │ │ └─TableFullScan_57 | 5.00 | 5 | cop[tikv] | table:region | tikv_task:{time:0s, loops:1} | keep order:false, stats:pseudo | N/A | N/A |
| │ │ │ │ └─TableReader_56(Probe) | 25.00 | 25 | root | | time:469.1µs, loops:2, cop_task: {num: 1, max: 961.6µs, proc_keys: 25, rpc_num: 1, rpc_time: 938.3µs, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | data:TableFullScan_55 | 679 Bytes | N/A |
| │ │ │ │ └─TableFullScan_55 | 25.00 | 25 | cop[tikv] | table:nation | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 25, total_process_keys_size: 3109, total_keys: 26, get_snapshot_time: 7.15µs, rocksdb: {key_skipped_count: 25, block: {cache_hit_count: 4}}} | keep order:false, stats:pseudo | N/A | N/A |
| │ │ │ └─TableReader_61(Probe) | 100000.00 | 100000 | root | | time:11.8ms, loops:100, cop_task: {num: 10, max: 2.55ms, min: 770.1µs, avg: 1.29ms, p95: 2.55ms, max_proc_keys: 480, p95_proc_keys: 480, rpc_num: 10, rpc_time: 12.8ms, copr_cache_hit_ratio: 0.90, distsql_concurrency: 15} | data:TableFullScan_60 | 964.4 KB | N/A |
| │ │ │ └─TableFullScan_60 | 100000.00 | 100000 | cop[tikv] | table:supplier | tikv_task:{proc max:40ms, min:1ms, avg: 11ms, p80:20ms, p95:40ms, iters:137, tasks:10}, scan_detail: {total_process_keys: 480, total_process_keys_size: 87745, total_keys: 481, get_snapshot_time: 254.5µs, rocksdb: {key_skipped_count: 480, block: {cache_hit_count: 6}}} | keep order:false | N/A | N/A |
| │ │ └─TableReader_63(Probe) | 1500000.00 | 1500000 | root | | time:153ms, loops:1468, cop_task: {num: 54, max: 31ms, min: 242