Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TiDB node memory spikes abnormally; once memory is full the EC2 machine hangs, its port stays alive, but it cannot serve requests
【TiDB Usage Environment】Production environment
【TiDB Version】Upgraded from TiDB v5.4 to v6.1
【Reproduction Path】Operations performed that led to the issue
【Encountered Issue: Phenomenon and Impact】We changed tidb_mem_quota_query to 10G. After the TiDB node's memory was exhausted, we reverted the parameter to its initial value. Since the revert, however, memory on one TiDB node has kept growing, with no slow logs, no error logs, and not even many normal log entries. The number of connections, latency, and other metrics are all stable.
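To put the 10G setting in perspective, here is a rough sketch of the arithmetic. It assumes that "10G" means 10 GiB, that tidb_mem_quota_query is expressed in bytes and applies per query (as documented for v6.1), and that several heavy queries can run at once. The helper names and the concurrency figure are hypothetical, chosen only for illustration.

```python
# Binary unit multipliers; assumes "G" means GiB (an assumption, not stated in the post).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def quota_to_bytes(text: str) -> int:
    """Convert a size such as '10G', '10GiB' or '1073741824' to bytes."""
    s = text.strip().upper()
    # Drop an optional trailing 'iB' or 'B' so that '10GiB' and '10GB' become '10G'.
    for suffix in ("IB", "B"):
        if s.endswith(suffix):
            s = s[: -len(suffix)]
            break
    if s and s[-1] in UNITS:
        return int(float(s[:-1]) * UNITS[s[-1]])
    return int(s)


def worst_case_bytes(quota_bytes: int, concurrent_queries: int) -> int:
    """Upper bound on query memory if every concurrent query reaches its quota."""
    return quota_bytes * concurrent_queries


quota = quota_to_bytes("10G")
print(quota)                       # bytes allowed per query
print(worst_case_bytes(quota, 8))  # bound for 8 hypothetical concurrent heavy queries
```

With a per-query quota this large, a handful of concurrent heavy queries can exceed the RAM of a typical instance before any single query hits its own limit.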
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】
Based on your description, the issue might be due to high memory usage by the TiDB node. Here are some recommended measures:
- Check if the TiDB version has any known memory leak issues. If so, upgrade to the latest version.
- Inspect the TiDB configuration file to ensure parameters are correctly set, such as whether tidb_mem_quota_query is properly configured.
- Use the TiDB monitoring panel or Grafana to monitor TiDB’s memory usage. Identify which components are consuming a large amount of memory and check for any abnormal fluctuations in memory usage.
- Adjust memory-related TiDB parameters, such as tidb_mem_quota_query (the per-query memory quota, in bytes) and tidb_mem_oom_action (what TiDB does when a query exceeds that quota), to control TiDB’s memory usage.
- Adjust the number of TiDB connections to avoid excessive memory usage due to too many connections.
- Optimize query statements and plans, such as using indexes and avoiding full table scans, to reduce memory usage.
- If high memory usage by the TiDB node causes the machine to become unresponsive, consider restarting the TiDB node or using TiDB’s dynamic parameter adjustment feature to gradually adjust parameters and observe memory usage.
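The monitoring step above can be made more concrete. The following is a minimal sketch, not a TiDB tool: it assumes you have exported (timestamp in seconds, resident memory in bytes) samples for the tidb-server process from Grafana or a similar source, fits a least-squares line through them, and estimates how long until the host's memory would be exhausted at that rate. All function names and the sample values are hypothetical.

```python
from statistics import mean


def growth_rate(samples):
    """Least-squares slope, in bytes per second, of (timestamp_s, rss_bytes) samples."""
    ts = [t for t, _ in samples]
    ms = [m for _, m in samples]
    t_bar, m_bar = mean(ts), mean(ms)
    denom = sum((t - t_bar) ** 2 for t in ts)
    if denom == 0:
        raise ValueError("need samples taken at two or more distinct timestamps")
    return sum((t - t_bar) * (m - m_bar) for t, m in samples) / denom


def seconds_until_full(samples, total_bytes):
    """Estimated seconds until memory reaches total_bytes, or None if not growing."""
    rate = growth_rate(samples)
    if rate <= 0:
        return None
    return (total_bytes - samples[-1][1]) / rate


# Made-up samples, one per minute, growing by 64 MiB each minute.
MiB = 1024**2
samples = [(0, 4096 * MiB), (60, 4160 * MiB), (120, 4224 * MiB)]
print(growth_rate(samples) / MiB)                  # MiB per second
print(seconds_until_full(samples, 16384 * MiB))    # seconds left on a 16 GiB host
```

A steady positive slope while connections and latency stay flat, as described in the original post, points at something accumulating inside the process rather than at a change in workload.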
It is important to note that TiDB memory usage issues can be complex and require analysis based on specific circumstances. It is recommended to back up data before attempting to resolve the issue to prevent data loss. Additionally, when deploying TiDB, set parameters reasonably based on actual business needs and hardware configurations to avoid high memory usage and potential machine unresponsiveness.
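When the logs are silent, as in this case, a heap profile of the affected tidb-server is often the most direct evidence. The sketch below assumes the node exposes Go's pprof endpoints on the TiDB status port (10080 by default) and that the port is reachable from where the script runs. The host name, output path, and helper names are hypothetical.

```python
import urllib.request

DEFAULT_STATUS_PORT = 10080  # TiDB's default status port; adjust if yours differs


def heap_profile_url(host: str, port: int = DEFAULT_STATUS_PORT) -> str:
    """Build the URL of the Go heap profile endpoint on a TiDB status port."""
    return f"http://{host}:{port}/debug/pprof/heap"


def save_heap_profile(host: str, out_path: str,
                      port: int = DEFAULT_STATUS_PORT,
                      timeout: float = 30.0) -> int:
    """Download the heap profile to out_path and return the number of bytes written."""
    with urllib.request.urlopen(heap_profile_url(host, port), timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return len(data)


# Example call (requires network access to the node, so it is left commented out):
# save_heap_profile("tidb-node-1", "heap-1.prof")
```

Taking two profiles a few minutes apart and comparing them, for example with go tool pprof and its -base option, shows which allocation sites account for the growth.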
Please provide the following: open TiDB Dashboard, go to Cluster Info > Hosts, and take a screenshot of that page.