Query Selection for Logging TiDB OOM to Temporary File running.sql

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB oom 记录到临时文件的 running.sql 的查询选择

| username: TiDBer_yyy

【TiDB Usage Environment】Production Environment
【TiDB Version】5.0.4
【Encountered Problem: Phenomenon and Impact】
There is a significant difference between the memory usage recorded in the running.sql file written on a TiDB OOM and the memory usage reported for the same SQL in the slow query log.

The maximum memory shown in running.sql is just over 10 GB:

mem_max: 11538442372 Bytes (10.7 GB)

The memory consumption of the same query between OOMs, however, exceeds 20 GB.

Questions:

  1. What criteria are used to select the top 10 SQLs recorded in OOM?

  2. Does the mem_max recorded in the running.sql file and the Mem_max field in the slow query log refer to memory occupied by TiDB or by TiKV?

  3. The memory distribution in the database shows many spikes as shown in the figure below. How does tidb-server manage memory?

| username: tidb狂热爱好者 | Original post link

You didn't mention which version you're on. Memory management in 6.5 is different from that in 5.4. Generally, we put limits on what individual SQL queries are allowed to do.

| username: Lystorm | Original post link

How is it deployed? Are TiDB and TiKV on the same node?

| username: TiDBer_yyy | Original post link

The old version is 5.0.4.
Recently, there have been OOM incidents, so we need to carefully investigate TiDB memory management.

| username: TiDBer_yyy | Original post link

Independently deployed.

| username: Lystorm | Original post link

Is your screenshot monitoring the tidb-server? Check the memory monitoring of TiKV as well.

| username: TiDBer_yyy | Original post link

We have 25 TiKV nodes, and TiKV performance is quite good; no significant memory jitter has been observed.

| username: tidb狂热爱好者 | Original post link

Version 5.0 has a lot of OOM issues, mainly caused by poorly written SQL. You can mitigate this by configuring a limit with max_execution_time. Refer to this document for details: TiDB Memory Control | PingCAP Docs.
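A minimal sketch of the limits mentioned above. The variable names are real TiDB system variables; the values are placeholders to tune for your workload, and on 5.x the behavior when the memory quota is exceeded is governed by the `oom-action` config item (`log` or `cancel`):

```sql
-- Cap statement execution time at 60 seconds (value is in milliseconds;
-- 0 means unlimited). Applies to read-only statements.
SET GLOBAL max_execution_time = 60000;

-- Cap per-query memory on the TiDB side at 2 GiB (bytes).
-- Note: session-scoped on 5.x; it only became GLOBAL-settable in later versions.
SET tidb_mem_quota_query = 2147483648;
```

With `oom-action = "cancel"`, a statement that exceeds the quota is killed instead of growing until the tidb-server itself OOMs.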

| username: tidb狂热爱好者 | Original post link

An OOM takes everything down together: the tidb-server stops and restarts. If you have configured limits, the SQL that drives memory usage high is prevented from running to completion, and you can notify the developers to fix it afterwards. That is where the difference lies.

| username: tidb狂热爱好者 | Original post link

TiKV will not experience OOM (Out of Memory) issues. The memory bottleneck is on the TiDB server. Increasing the memory on the TiDB server can also alleviate the problem.

| username: h5n1 | Original post link

Official OOM video

| username: TiDBer_yyy | Original post link

This has already been adjusted. The boss wants a more detailed breakdown of how memory is allocated.

| username: buddyyuan | Original post link

Try limiting the concurrency and then set tidb_enable_rate_limit_action to false to see what happens.
What does your SQL look like?

| username: buddyyuan | Original post link

→ TiDB’s memory.
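To compare against what running.sql reports, the slow query log's memory field can also be queried from TiDB itself: the `INFORMATION_SCHEMA` tables `SLOW_QUERY` and `CLUSTER_SLOW_QUERY` expose a `Mem_max` column (TiDB-side peak memory in bytes). A sketch:

```sql
-- Top 10 statements by peak TiDB-side memory, from the slow query log.
SELECT time, query, mem_max
FROM information_schema.cluster_slow_query
ORDER BY mem_max DESC
LIMIT 10;
```

Note this only covers statements that crossed the slow-query threshold, which is one reason the figures can diverge from what is captured in running.sql at OOM time.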

| username: buddyyuan | Original post link

The memory distribution of the database has many spikes as shown in the figure. How does tidb-server manage memory?

You can check the explanation of the tidb_enable_rate_limit_action variable, which is a flow control measure.
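A sketch of the two knobs suggested above, lowering scan concurrency and disabling the rate-limit flow-control action. The names are real TiDB system variables; the concurrency value is illustrative:

```sql
-- Reduce the parallelism of coprocessor (distsql) scans; the default is 15.
-- Fewer concurrent scans means less data buffered in tidb-server at once.
SET GLOBAL tidb_distsql_scan_concurrency = 5;

-- Disable the rate-limit action, so exceeding the memory quota falls through
-- to oom-action (e.g. cancel) instead of first throttling reads.
SET GLOBAL tidb_enable_rate_limit_action = OFF;
```

The flow-control behavior (throttling reads when the quota is hit) is what produces the sawtooth spikes in the memory graph, which speaks to question 3.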

| username: TiDBer_yyy | Original post link

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.