Large Query Causes System OOM

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 超大查询导致系统oom

| username: TiDBer_BraiRIcV

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.4.2

Question: A large SQL query in TiDB can exhaust the system memory and cause an OOM. Is there a parameter to control this and prevent OOM?

  1. First Scenario
    Run the SQL directly, without a hint. Because the query exceeds the configured mem-quota-query=500M limit, it immediately reports “Out Of Memory Quota!”, showing that the mem-quota-query parameter takes effect. This is TiDB’s memory-protection mechanism, so no problem here. (A quick way to check the quota is sketched after the error output below.)

[SQL]
SELECT
        wgc.*, wu.nickname AS 'sendUserNickName', wu.channel_id as register_channel_id,
        b.charm_score * wgc.gift_amount as charm_score from weplay_gift_consume wgc
        LEFT JOIN weplay_user wu ON wgc.uid = wu.uid
        LEFT JOIN weplay_gift_config b ON wgc.gift_id = b.id
        ORDER BY wgc.create_time DESC

[Err] 1105 - Out Of Memory Quota![conn_id=5]
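
For reference, the quota hit here comes from mem-quota-query; in TiDB 5.x the same limit is exposed per session as the tidb_mem_quota_query variable (in bytes, and confirmed later in this thread). A minimal sketch of checking and, if needed, raising it:

-- Check the current per-query memory quota (bytes)
SHOW VARIABLES LIKE 'tidb_mem_quota_query';
-- Raise it for the current session only, e.g. to 1 GiB
SET tidb_mem_quota_query = 1073741824;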
  2. Second Scenario
    With the hint, the SQL keeps running until the whole system OOMs. Is there a parameter that can stop the query from consuming all of the system memory and causing an OOM?

SELECT
        /*+ MERGE_JOIN(wu) */
        wgc.*, wu.nickname AS 'sendUserNickName', wu.channel_id as register_channel_id,
        b.charm_score * wgc.gift_amount as charm_score from weplay_gift_consume wgc
        LEFT JOIN weplay_user wu ON wgc.uid = wu.uid
        LEFT JOIN weplay_gift_config b ON wgc.gift_id = b.id
        ORDER BY wgc.create_time DESC

| username: Billmay表妹 | Original post link

Check this out: Column - Exploring optimizations for tidb server OOM issues | TiDB Community

| username: TiDBer_BraiRIcV | Original post link

Thank you for sharing, although it didn’t solve the problem.

| username: Kongdom | Original post link

Do you mean that when using hints, it is not subject to the mem-quota-query limit?

| username: buddyyuan | Original post link

Please share the EXPLAIN output (execution plan) of the second query.
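
For example, assuming the tables from the original post (EXPLAIN prints the plan without executing the statement):

EXPLAIN
SELECT
        /*+ MERGE_JOIN(wu) */
        wgc.*, wu.nickname AS 'sendUserNickName', wu.channel_id as register_channel_id,
        b.charm_score * wgc.gift_amount as charm_score from weplay_gift_consume wgc
        LEFT JOIN weplay_user wu ON wgc.uid = wu.uid
        LEFT JOIN weplay_gift_config b ON wgc.gift_id = b.id
        ORDER BY wgc.create_time DESC;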

| username: TiDBer_BraiRIcV | Original post link

Yes, but it keeps causing the server to OOM.

| username: TiDBer_BraiRIcV | Original post link

Pictures20221109175248.bmp (3.0 MB)

| username: TiDBer_BraiRIcV | Original post link

Projection_75 102550875.22 root weplayzy.weplay_gift_consume.id, weplayzy.weplay_gift_consume.uid, weplayzy.weplay_gift_consume.recipient_id, weplayzy.weplay_gift_consume.recipient_name, weplayzy.weplay_gift_consume.gift_id, weplayzy.weplay_gift_consume.gift_name, weplayzy.weplay_gift_consume.gift_type, weplayzy.weplay_gift_consume.gift_amount, weplayzy.weplay_gift_consume.gift_icon, weplayzy.weplay_gift_consume.gift_price, weplayzy.weplay_gift_consume.host_id, weplayzy.weplay_gift_consume.room_id, weplayzy.weplay_gift_consume.diamond_amount, weplayzy.weplay_gift_consume.wealth_amount, weplayzy.weplay_gift_consume.remark, weplayzy.weplay_gift_consume.data_source, weplayzy.weplay_gift_consume.channel_id, weplayzy.weplay_gift_consume.create_time, weplayzy.weplay_gift_consume.gift_source, weplayzy.weplay_gift_consume.gift_order_id, weplayzy.weplay_user.nickname, weplayzy.weplay_user.channel_id, mul(weplayzy.weplay_gift_config.charm_score, weplayzy.weplay_gift_consume.gift_amount)->Column#115
└─IndexJoin_81 102550875.22 root left outer join, inner:TableReader_78, outer key:weplayzy.weplay_gift_consume.gift_id, inner key:weplayzy.weplay_gift_config.id, equal cond:eq(weplayzy.weplay_gift_consume.gift_id, weplayzy.weplay_gift_config.id)
  ├─IndexJoin_93(Build) 102550875.22 root left outer join, inner:TableReader_90, outer key:weplayzy.weplay_gift_consume.uid, inner key:weplayzy.weplay_user.uid, equal cond:eq(weplayzy.weplay_gift_consume.uid, weplayzy.weplay_user.uid)
  │ ├─IndexLookUp_99(Build) 102158297.00 root
  │ │ ├─IndexFullScan_97(Build) 102158297.00 cop[tikv] table:wgc, index:idx_create_time(create_time) keep order:true, desc
  │ │ └─TableRowIDScan_98(Probe) 102158297.00 cop[tikv] table:wgc keep order:false
  │ └─TableReader_90(Probe) 1.00 root data:TableRangeScan_89
  │   └─TableRangeScan_89 1.00 cop[tikv] table:wu range: decided by [weplayzy.weplay_gift_consume.uid], keep order:false
  └─TableReader_78(Probe) 0.96 root data:TableRangeScan_77
    └─TableRangeScan_77 0.96 cop[tikv] table:b range: decided by [weplayzy.weplay_gift_consume.gift_id], keep order:false

| username: Raymond | Original post link

  1. Try turning off the tidb_enable_rate_limit_action variable.
  2. Enable the spill-to-disk feature (oom-use-tmp-storage).
  3. Try lowering tidb_distsql_scan_concurrency (set it to 10, 8, 6, or even lower).

You can try 1 and 2 first, and if that doesn’t work, add 3 and try again; see the sketch below.
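
A minimal sketch of applying 1 and 3 in a session (2, spilling to disk, maps to the oom-use-tmp-storage config item, which is on by default):

SET tidb_enable_rate_limit_action = OFF;
-- Start around 10 and go lower (8, 6, ...) if memory still climbs
SET tidb_distsql_scan_concurrency = 10;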

| username: TiDBer_BraiRIcV | Original post link

1, 2, and 3 are all set.

set tidb_enable_rate_limit_action = 0;
set tidb_distsql_scan_concurrency = 1;

System memory is no longer rising as fast as before, but it still climbs slowly; after running for nearly 10 minutes, memory fills up and the system eventually OOMs.

It seems TiDB does not yet have a “trade disk for time” mechanism like MySQL’s. MySQL may be slow, but it always finishes the job; TiDB either reports “Out Of Memory Quota” or eventually drives the system to OOM. Are large SQL queries simply unmanageable???

[2022/11/10 11:17:10.053 +08:00] [INFO] [row_container.go:368] ["memory exceeds quota, spill to disk now."] [consumed=524678879] [quota=524288000]
[2022/11/10 11:17:10.053 +08:00] [INFO] [row_container.go:562] ["memory exceeds quota, spill to disk now."] [consumed=524638922] [quota=524288000]
[2022/11/10 11:17:10.058 +08:00] [INFO] [row_container.go:368] ["memory exceeds quota, spill to disk now."] [consumed=525997030] [quota=524288000]
[2022/11/10 11:17:10.669 +08:00] [INFO] [row_container.go:368] ["memory exceeds quota, spill to disk now."] [consumed=524618199] [quota=524288000]
[2022/11/10 11:17:10.677 +08:00] [WARN] [expensivequery.go:179] [expensive_query] [cost_time=15.470587844s] [cop_time=44.804289864s] [process_time=145.775s] [wait_time=5.234s] [request_count=21621] [total_keys=7242446] [process_keys=6781539] [num_cop_tasks=21621] [process_avg_time=0.006742287s] [process_p90_time=0.017s] [process_max_time=0.887s] [process_max_addr=10.0.8.39:20160] [wait_avg_time=0.000242079s] [wait_p90_time=0.001s] [wait_max_time=0.091s] [wait_max_addr=10.0.8.39:20161] [stats=weplay_user:437269348093526017,weplay_gift_config:437246981652611087,weplay_gift_consume:437268436107395086] [conn_id=29] [user=root] [database=weplayzy] [table_ids="[1008,308,1012]"] [index_names="[weplay_gift_consume:idx_uid]"] [txn_start_ts=437269355603689479] [mem_max="658532008 Bytes (628.0 MB)"] [sql="SELECT\r\n        /*+ MERGE_JOIN(wu) */\r\n        wgc.*,wu.nickname AS 'sendUserNickName',wu.channel_id as register_channel_id,\r\n        b.charm_score*wgc.gift_amount as charm_score from weplay_gift_consume wgc\r\n        LEFT JOIN weplay_user wu ON wgc.uid = wu.uid\r\n        LEFT JOIN weplay_gift_config b ON wgc.gift_id = b.id\r\n        ORDER BY wgc.create_time DESC"]

| username: 裤衩儿飞上天 | Original post link

What business scenario requires querying 100 million+ rows at a time? This looks like something to optimize at the business level.

| username: TiDBer_BraiRIcV | Original post link

Yes, in the actual application the time-field filter is constrained, so a one-week query stays within 2 million rows. (The hint (/*+ MERGE_JOIN(wu) */) is still required; without it the query reports Out Of Memory Quota.)

Setting the application scenario aside: for large SQL queries, TiDB’s outcome is either “Out Of Memory Quota” or a system OOM, seemingly without an efficient spill-to-disk mechanism (it does write data to disk, yet memory keeps growing and eventually OOMs). What I want is for large SQL queries to be able to run to completion on TiDB.

| username: buddyyuan | Original post link

tidb_mem_quota_query
tidb_mem_oom_action
oom-use-tmp-storage
tmp-storage-quota

Take a look at these variables and parameters. What are they set to?

| username: TiDBer_BraiRIcV | Original post link

MySQL [(none)]> show variables like 'tidb_mem_quota_query%';
+----------------------+-----------+
| Variable_name        | Value     |
+----------------------+-----------+
| tidb_mem_quota_query | 524288000 |
+----------------------+-----------+
1 row in set (0.020 sec)

MySQL [(none)]> show config where name like '%oom_action%';
+------+-----------------+------------+--------+
| Type | Instance        | Name       | Value  |
+------+-----------------+------------+--------+
| tidb | 10.0.8.183:4000 | oom-action | cancel |
| tidb | 10.0.8.45:4000  | oom-action | cancel |
| tidb | 10.0.8.39:4000  | oom-action | cancel |
+------+-----------------+------------+--------+
3 rows in set (0.019 sec)

MySQL [(none)]> show config where name like '%tmp-storage%';
+------+-----------------+---------------------+--------------------------------------------------------------+
| Type | Instance        | Name                | Value                                                        |
+------+-----------------+---------------------+--------------------------------------------------------------+
| tidb | 10.0.8.183:4000 | oom-use-tmp-storage | true                                                         |
| tidb | 10.0.8.183:4000 | tmp-storage-path    | /tmp/0_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage |
| tidb | 10.0.8.183:4000 | tmp-storage-quota   | -1                                                           |
| tidb | 10.0.8.45:4000  | oom-use-tmp-storage | true                                                         |
| tidb | 10.0.8.45:4000  | tmp-storage-path    | /tmp/0_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage |
| tidb | 10.0.8.45:4000  | tmp-storage-quota   | -1                                                           |
| tidb | 10.0.8.39:4000  | oom-use-tmp-storage | true                                                         |
| tidb | 10.0.8.39:4000  | tmp-storage-path    | /tmp/0_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage |
| tidb | 10.0.8.39:4000  | tmp-storage-quota   | -1                                                           |
+------+-----------------+---------------------+--------------------------------------------------------------+
9 rows in set (0.022 sec)


| username: buddyyuan | Original post link

What kind of disk is your /tmp on?
When operators spill to disk, temp I/O becomes a factor: if writes are not fast enough, data still accumulates in memory and leads to OOM (Out of Memory).

| username: TiDBer_BraiRIcV | Original post link

It’s an SSD.

| username: 裤衩儿飞上天 | Original post link

How much total memory does the TiDB server have, how much tmp space is there, and how large are the 100 million+ rows? If all these conditions are satisfied, the query should run to completion; if any one is not, an OOM is to be expected.

| username: TiDBer_BraiRIcV | Original post link

tmp has been moved to the SSD data disk; its 400 GB of space is sufficient.
After running for more than 10 minutes, the system still hit an OOM (Out of Memory).

top - 15:05:27 up 54 days, 22:22,  3 users,  load average: 69.11, 29.01, 23.99
Tasks: 247 total,   2 running, 245 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us, 48.4 sy,  0.0 ni,  0.5 id, 48.8 wa,  0.0 hi,  2.1 si,  0.0 st
KiB Mem : 99.3/32778404 [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ]
KiB Swap:  0.0/0        [                                                                                                    ]
     

TiDB logs (from startup through the OOM and the subsequent TiDB restart):

tidb.log (544.6 KB)

| username: Raymond | Original post link

How much memory does your TiDB server process use?

| username: Raymond | Original post link

  1. Try turning off the tidb_enable_rate_limit_action variable.
  2. Enable the spill-to-disk feature (oom-use-tmp-storage).
  3. Try lowering tidb_distsql_scan_concurrency (set it to around 2).
  4. Set the tidb_enable_chunk_rpc system variable to OFF (this may reduce statement execution efficiency).

Ultimately, though, it is still recommended to optimize the statement itself. A sketch of applying these settings follows.
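
A minimal sketch of applying 1, 3, and 4 in a session (2, spilling to disk, is the oom-use-tmp-storage config item, already shown as true in the output above):

SET tidb_enable_rate_limit_action = OFF;
SET tidb_distsql_scan_concurrency = 2;
-- Trades execution speed for lower memory pressure
SET tidb_enable_chunk_rpc = OFF;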