TiDB Cluster: Single SQL Session Query Exceeds Parameter Limit Without OOM Error, OOM Occurs Only When Memory Reaches 37GB

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb集群,单个sql会话查询超过参数限制,并没有报出OOM,内存达到37G时,才发生OOM

| username: vcdog

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.0
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Issue Phenomenon and Impact]

TiDB cluster, single SQL session query exceeds parameter limit but still queries

Following up on the previous post, a new issue has been discovered:

  1. Parameter configuration of Tidb-server in the production environment TiDB cluster:
    mem-quota-query: 16g
    performance.txn-total-size-limit: 4g
    set global tidb_mem_oom_action=‘cancel’;

Under the current settings of these three configuration parameters, it was found that when a single SQL query is executed, the memory consumption does not report OOM at 16G, but rather at around 37G.


Monitoring screenshot as follows:

Therefore, I would like to ask for an explanation of this phenomenon. Why does the memory consumption report OOM not just after exceeding 16G or 4G?

[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: Raymond | Original post link

How much did you set for the system variable tidb_mem_quota_query?

| username: 人如其名 | Original post link

Post the execution plan.

| username: tidb菜鸟一只 | Original post link

Have you set this parameter?

| username: vcdog | Original post link

The default value of tidb_enable_clustered_index is INT_ONLY. This means that the clustered index is enabled by default only for tables with integer primary keys.

| username: vcdog | Original post link

The image is not visible. Please provide the text you need translated.

| username: vcdog | Original post link

This parameter, as far as I remember, has not been adjusted after the upgrade. Currently, it is in the “on” state. I will try to adjust it to “off” and test it again.

Yesterday, there was another SQL operation in the production environment, which was an update operation. A single update statement occupied 41G of memory.

| username: tidb菜鸟一只 | Original post link

You can try changing it. I remember someone on the forum had a similar issue because of this parameter.

| username: xingzhenxiang | Original post link

I usually use a script to kill processes.

for list in `/server/mysql5.7/bin/mysql -hXXX.XXX.XXX.XXX -p'PASSWORD' -vvv -e "select id from INFORMATION_SCHEMA.processlist a where a.info is not null and (mem >= 11474836480 or time > 600);" | grep -Ev 'id|ID|iD|Id' | awk -F "|" '{print $2}'`
do 
    echo $list
    /server/mysql5.7/bin/mysql -hXXX.XXX.XXX.XXX -p'PASSWORD' -vvv -e "select id, time, info, mem from INFORMATION_SCHEMA.processlist a where id=$list and a.info is not null;" > /sh/killtestlog/`date +%s`.log
    /server/mysql5.7/bin/mysql -hXXX.XXX.XXX.XXX -p'PASSWORD' -vvv -e "kill tidb $list;"
done;