Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: tidb-server 5.3.0 内存控制不生效
[TiDB Usage Environment] Production Environment
[TiDB Version] 5.3.0
[Encountered Problem] Running a very slow SQL on tidb-server (joining a table with 80 million rows and another with 100 million rows) causes OOM. Memory control settings such as mem-quota-query and oom-action are not effective.
Configuration is as follows:
server_configs:
tidb:
log.slow-threshold: 300
mem-quota-query: 8589934592
oom-action: cancel
oom-use-tmp-storage: true
performance.server-memory-quota: 30212254720
performance.txn-total-size-limit: 2173741824
tmp-storage-path: /home/shared/tmp-storage
tidb-server log:
[2022/09/28 16:31:38.100 +08:00] [WARN] [memory_usage_alarm.go:140] [“tidb-server has the risk of OOM. Running SQLs and heap profile will be recorded in record path”] [“is server-memory-quota set”=true] [server-memory-quota=30212254720] [“tidb-server memory usage”=24235258432] [memory-usage-alarm-ratio=0.8] [“record path”=“/home/shared/tmp-storage/1004_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage/record”]
[2022/09/28 16:31:46.500 +08:00] [WARN] [expensivequery.go:179] [expensive_query] [cost_time=60.07822628s] [cop_time=12.266157414s] [process_time=287.003s] [wait_time=0.973s] [request_count=374] [total_keys=347676927] [process_keys=174355378] [num_cop_tasks=374] [process_avg_time=0.7673877s] [process_p90_time=1.313s] [process_max_time=2.1390000000000002s] [process_max_addr=dx-op-tidb-tikv3-pm:20160] [wait_avg_time=0.002601604s] [wait_p90_time=0.003s] [wait_max_time=0.095s] [wait_max_addr=dx-op-tidb-tikv3-pm:20160] [stats=ods_bort_trade_t_order_item_part:436299555451174938,ods_bort_trade_t_order_part:436300359591002118] [conn_id=3] [user=dbadmin_01] [database=pipe_data_realtime] [table_ids=“[18951,78473,18951,78473]”] [txn_start_ts=436300374716710913] [mem_max=“10375289917 Bytes (9.66 GB)”] [sql=“select 0 as id, scencekey ,10001 as resourceType, productid as resourceId, bizid as bizId, sum(quantity) as soldCount, CURRENT_timestamp() as dbctime, CURRENT_timestamp() as dbutime from (select id,scencekey from pipe_data_realtime.ods_bort_trade_t_order_part where state <> ‘4’) a join (select orderitemid, orderid, productid, variantid, quantity, bizid, refundstate, transferstate from pipe_data_realtime.ods_bort_trade_t_order_item_part where refundstate <> 2 and transferstate <> 1 )b on a.id = b.orderid group by scencekey, bizid , productid union all select 0 as id, scencekey ,10002 as resourceType, variantid as resourceId, bizid as bizId, sum(quantity) as soldCount , CURRENT_timestamp() as dbctime, CURRENT_timestamp() as dbutime from(select id,scencekey from pipe_data_realtime.ods_bort_trade_t_order_part where state <> ‘4’) a join (select orderitemid, orderid, productid, variantid, quantity, bizid, refundstate, transferstate from pipe_data_realtime.ods_bort_trade_t_order_item_part where refundstate <> 2 and transferstate <> 1 ) b on a.id = b.orderid group by scencekey, bizid , variantid”]
[2022/09/28 16:31:46.931 +08:00] [INFO] [coprocessor.go:1149] [“memory exceeds quota, destroy one token now.”] [consumed=10378389257] [quota=8589934592] [“total token count”=80] [“remaining token count”=37]
[Expected Outcome]
tidb-server should kill SQL queries that exceed memory limits according to the configuration, instead of causing OOM.