TiDB Cluster: Single SQL Session Query Exceeds Parameter Limit but Still Executes

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb集群,单个sql会话查询超过参数限制,仍能查询

| username: vcdog

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.0
[Reproduction Path] Operations performed that led to the issue
[Encountered Issue: Phenomenon and Impact]
Parameter configuration of Tidb-server in the production environment TiDB cluster:
mem-quota-query: 16g
performance.txn-total-size-limit: 4g

However, there is a situation where a user initiates a large SQL query, consuming memory up to 60g+.

Under normal circumstances, if it exceeds 16g, the tidb-server should actively interrupt the user’s connection query. However, as seen from the screenshot above, it has not been released, eventually leading to OOM.

[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: Billmay表妹 | Original post link

When the memory usage of a single SQL session exceeds the value set by the mem-quota-query parameter, TiDB will proactively interrupt that session. However, the screenshot you provided shows that the session was never released, eventually leading to an OOM (Out of Memory) situation.

This issue may be due to improper parameter settings in TiDB. You can try adjusting the mem-quota-query parameter value to ensure that the memory usage of a single SQL session does not exceed TiDB’s memory limit. Additionally, you can try adjusting the performance.txn-total-size-limit parameter value to ensure that the transaction memory usage in TiDB does not exceed TiDB’s memory limit.

If you have already tried the above methods but still encounter issues, you might consider using TiDB’s Profile feature to analyze the performance bottlenecks of that SQL session. The Profile feature can help you analyze the execution plan, bottlenecks, and resource consumption of SQL statements, thereby helping you optimize the performance of SQL statements.

| username: vcdog | Original post link

Could it be related to these parameters?

| username: Hacker_ufuLjDKs | Original post link

First, check if the query limit parameter value is set too high as per your cousin’s suggestion. If it exceeds the allowable value of the TiDB instance, it becomes meaningless, and setting it would be no different from not setting it at all.

Secondly, as you mentioned in your reply, I think it’s also possible. Since you have a production environment, you definitely have a testing environment. Try the following in the testing environment:

  • The standalone tidb_mem_quota_query
  • Testing in combination with parameters related to tmp.

Looking forward to your test results.

Lastly, I want to say, if the above two checks are done, could it be that the current version has resource setting flaws?
I see that the v7 version of TiDB has significantly enhanced resource control, claiming to be a milestone: v7 - Resource Group-based Resource Control Feature
It is a milestone for TiDB - the foundation for multi-tenancy.
Three ways to bind to a resource group: Bind users to a specific resource group through the CREATE USER or ALTER USER statement, ensuring user sessions are always constrained by set boundaries;
Set the current session’s resource group through SET RESOURCE GROUP;
Set the current statement’s resource group through the Hint RESOURCE_GROUP().

| username: 我是咖啡哥 | Original post link

In versions prior to v6.5.0, this variable was used to set the memory usage limit for a single query.

In versions v6.5.0 and later, this variable is used to set the overall memory usage limit for a single session. If the memory usage during the execution of a session exceeds this threshold, it will trigger the behavior specified in the system variable tidb_mem_oom_action.

Have you set tidb_mem_oom_action?
If the variable value is LOG, then when the memory usage of an SQL exceeds a certain threshold (controlled by the session variable tidb_mem_quota_query), the SQL will continue to execute, but TiDB will print a LOG in the log file.

| username: BraveChen | Original post link

tidb_enable_rate_limit_action

Note

This variable is enabled by default but may cause memory to be uncontrolled by tidb_mem_quota_query, thereby increasing the risk of OOM. Therefore, it is recommended to set this variable to OFF.

  • Scope: SESSION | GLOBAL
  • Default value: ON
  • This variable controls whether to enable dynamic memory control for data reading operators. By default, data reading operators use the maximum number of threads allowed by tidb_distsql_scan_concurrency to read data. When the memory usage of a single SQL statement exceeds tidb_mem_quota_query once, the data reading operator will stop one thread.
  • When the data reading operator is left with only one thread and the memory usage of a single SQL statement continues to exceed tidb_mem_quota_query, the SQL statement will trigger other memory control behaviors, such as spilling to disk.
| username: BraveChen | Original post link

Check this parameter. Recently, we encountered this issue, and the support ticket informed us that it was caused by this parameter.

| username: xingzhenxiang | Original post link

Directly use a script with simple commands, focusing on memory and time dimensions, and also recording slow SQL for easy reference.

for list in `/mysql5.7/bin/mysql -hyourip -p'yourpassword' -vvv -e "select id from INFORMATION_SCHEMA.processlist a where a.info is not null and (mem >=11474836480 or time >600);" |grep -Ev 'id|ID|iD|Id' |awk -F "|" '{print $2}'`
do 
  echo $list
  /mysql5.7/bin/mysql -hyourip -p'yourpassword' -vvv -e "select id,time,info,mem from INFORMATION_SCHEMA.processlist a where id=$list and a.info is not null;" >/killsqllog/`date +%s`.log
  /mysql5.7/bin/mysql -hyourip -p'yourpassword' -vvv -e "kill tidb $list;"
done;
| username: vcdog | Original post link

In the TiDB mutual assistance group, an expert pointed out and identified the cause of the problem.

It is indeed due to the setting of the tidb_mem_oom_action parameter. This variable parameter only takes effect when adjusted in the system parameter command line and does not support modification through the configuration file.

The modification method is as follows:

mysql> set global tidb_mem_oom_action='cancel';
Query OK, 0 rows affected (0.02 sec)

mysql> show variables like '%oom%';
+--------------------------------+--------+
| Variable_name                  | Value  |
+--------------------------------+--------+
| tidb_enable_tmp_storage_on_oom | ON     |
| tidb_mem_oom_action            | CANCEL |
+--------------------------------+--------+
2 rows in set (0.01 sec)

Then, when using a large SQL query, if it exceeds the mem-quota-query, it will throw an OOM.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.