Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: Tidb异常重启问题
【TiDB Usage Environment】Production Environment
【TiDB Version】V5.4
【Encountered Problem】TiDB server abnormal restart
【Reproduction Path】Query transaction data through Superset
【Problem Phenomenon and Impact】
Found abnormal memory usage of TiDB server, the operating system directly killed the TiDB server
Checked abnormal restart through dmesg
Memory control
How to control the memory of the TiDB server instance? To avoid abnormal restart of the TiDB server affecting the business.
The SQL has been found. But is there any way to avoid TiDB server restarts, or directly return an error message for similar SQL?
It’s an OOM issue. Optimize the SQL, remove those full table scans, and if optimization is not possible, set limits:
With these three configurations set properly, the upper limit of TiDB memory is basically determined. If it still exceeds, it’s likely due to an increase in the number of connections. If the number of connections is also surging, limit the max-connection as well.
The correct approach is to solve it from the business side: control the number of connections and the execution plan of the SQL. If it’s difficult to solve from the business side, the combination of the above parameters will at least prevent TiDB from going OOM.
Currently, I control SQL query memory and execution duration through oom-action, mem-quota-query, and max_execution_time. If a large SQL execution exceeds the execution duration, the SQL execution is directly terminated to ensure the stability of the TiDB server.
This topic will be automatically closed 60 days after the last reply. No new replies are allowed.