Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TiDB system monitoring shows IO is full. What is going on, and from which angles can it be analyzed?
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 6.1
By "IO being full" you mean IO utilization, right? High IO utilization is not a problem in itself; you need to look at other metrics as well. If latency is low and statements execute quickly, it just means the resources are being fully utilized.
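To make the point above concrete, here is a minimal sketch of reading utilization together with latency. The names `DiskSample` and `classify_io` and both thresholds are made up for illustration; they are not TiDB recommendations, so adjust them to your own disks.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    util_pct: float  # share of time the device was busy, 0-100 (e.g. iostat %util)
    await_ms: float  # average time per IO request in ms, queueing included


def classify_io(sample, util_threshold=90.0, await_threshold_ms=10.0):
    """Classify a disk sample; both thresholds are illustrative assumptions."""
    busy = sample.util_pct >= util_threshold
    slow = sample.await_ms >= await_threshold_ms
    if busy and slow:
        return "likely bottleneck"
    if busy:
        return "fully utilized, latency fine"
    if slow:
        return "slow despite low utilization"
    return "healthy"


print(classify_io(DiskSample(util_pct=99.0, await_ms=2.0)))
print(classify_io(DiskSample(util_pct=99.0, await_ms=40.0)))
```

Only the second sample (busy and slow at the same time) is the case worth worrying about.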
Check if there are any slow queries.
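One way to check is to query the slow query table directly. Below is a small sketch that only builds the SQL text; the helper name `slow_query_sql` and its defaults are hypothetical, and you would run the resulting statement with whatever MySQL-compatible client you already use. `INFORMATION_SCHEMA.SLOW_QUERY` covers the TiDB node you are connected to, while `CLUSTER_SLOW_QUERY` covers all nodes.

```python
def slow_query_sql(min_seconds=1.0, limit=10, cluster_wide=False):
    """Build a query listing the slowest statements recorded in the slow log.

    min_seconds and limit are illustrative defaults, not recommended values.
    """
    if min_seconds < 0 or limit <= 0:
        raise ValueError("min_seconds must be >= 0 and limit must be > 0")
    table = "CLUSTER_SLOW_QUERY" if cluster_wide else "SLOW_QUERY"
    return (
        "SELECT Time, Query_time, Query "
        f"FROM INFORMATION_SCHEMA.{table} "
        f"WHERE Query_time >= {float(min_seconds)} "
        "ORDER BY Query_time DESC "
        f"LIMIT {int(limit)};"
    )


print(slow_query_sql(min_seconds=5, limit=20, cluster_wide=True))
```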
It seems that the TiKV IO metrics are maxed out, which makes inserts very slow overall.
Here is one approach: work out where TiDB uses IO, then check each of those points. Of course, first rule out slow or large queries as the cause, then consider other factors.
If there are slow queries, is optimizing the SQL the only option? Or are there any tuning methods for TiDB?
If there are slow queries, should I optimize the SQL or are there performance parameters in TiDB that can be optimized? I’m a newbie and have just started with TiDB.
Approach it from several angles and investigate them one by one.
Do you have any optimization materials?
You can refer to the official documentation for performance tuning.
You should first optimize the slowest SQL queries. The dashboard has a CPU usage ranking; work through the queries one by one in order of usage. Alternatively, to protect cluster stability, you can cap statement execution time, for example by terminating SQL queries that run for more than 60 seconds.
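For the 60-second limit, one option is the `max_execution_time` system variable, which is expressed in milliseconds. The sketch below just builds the `SET` statement; the helper name `max_execution_time_sql` is made up for illustration. Which statement types the limit applies to differs between TiDB versions, so check the documentation for your version before relying on it.

```python
def max_execution_time_sql(seconds, scope="SESSION"):
    """Build a SET statement capping statement run time.

    max_execution_time takes milliseconds; 0 means no limit.
    """
    scope = scope.upper()
    if scope not in ("SESSION", "GLOBAL"):
        raise ValueError("scope must be SESSION or GLOBAL")
    if seconds < 0:
        raise ValueError("seconds must be >= 0")
    millis = int(seconds * 1000)
    return f"SET {scope} max_execution_time = {millis};"


print(max_execution_time_sql(60))            # session-level cap
print(max_execution_time_sql(60, "global"))  # cluster-wide cap for new sessions
```

Trying it at session level first is the safer choice, since a global cap can also cut off legitimate long-running jobs.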