[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.1
[Encountered Problem: Phenomenon and Impact] High memory consumption
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
TiDB has been consuming a lot of memory over the past few days. I just doubled the memory, and usage is already back to almost 90%. I don’t know where the problem is.
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Issue Phenomenon and Impact]
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots / Logs / Monitoring]
Please post your issue according to the problem template~
At least mention whether this is read traffic or write traffic (most likely read). Also, approximately how many bytes per minute is the yellow part?
Check the tables involved in the yellow part to see what SQL queries were executed. Are there any slow queries with large table scans? Pull out the execution plan and analyze it.
This picture doesn’t show anything clearly. For a mixed deployment, first check how much memory each component occupies, for example how much TiDB and TiKV each use, and see which one uses more. Then we can analyze further.
We only have a tiny amount of data, and the backup is less than 1GB. Isn’t 4C16G for each of the three nodes enough? If we switch to MySQL, I think a single instance with 4C8G can handle it, right? Why does TiDB consume so many resources?
If the problem can be solved by MySQL, there’s no need to use TiDB.
When resources are insufficient, PD should not be placed together with TiKV.
In a mixed deployment, if parameters are not adjusted, each component will assume it has exclusive use of the entire machine. Therefore, in a mixed deployment, parameters must be recalculated.
You need to set the parameters mentioned in the documentation so that resources are properly divided among the components; otherwise, it will be difficult to ensure cluster stability (e.g., OOM issues). TiDB, PD, and TiKV on the same machine affect each other and compete for the same resources.
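To make the "each component assumes it owns the machine" point concrete, here is a rough sketch of the arithmetic for one 16 GiB node. It assumes the usual defaults, where `tidb_server_memory_limit` is 80% of host memory and TiKV's `storage.block-cache.capacity` is 45% of host memory; please verify both against the documentation for your version. The function names and the fractions passed to `planned_split` are hypothetical, chosen only for illustration, not recommendations.

```python
def default_claims(total_gib: float) -> dict:
    """Memory each component may try to use on one host if defaults are untouched."""
    return {
        # Assumed default: tidb_server_memory_limit = 80% of host memory.
        "tidb": total_gib * 0.80,
        # Assumed default: storage.block-cache.capacity = 45% of host memory.
        # TiKV's real footprint is larger than its block cache alone.
        "tikv_block_cache": total_gib * 0.45,
    }


def overcommit_ratio(total_gib: float) -> float:
    """Sum of the default claims divided by physical memory (> 1 means overcommitted)."""
    return sum(default_claims(total_gib).values()) / total_gib


def planned_split(total_gib: float, tidb_fraction: float, block_cache_fraction: float) -> dict:
    """Hypothetical explicit split; whatever is left goes to PD, the OS, and headroom."""
    tidb = total_gib * tidb_fraction
    block_cache = total_gib * block_cache_fraction
    return {
        "tidb": tidb,
        "tikv_block_cache": block_cache,
        "remaining": total_gib - tidb - block_cache,
    }


if __name__ == "__main__":
    host_gib = 16.0
    print("default claims (GiB):", default_claims(host_gib))
    print("overcommit ratio:", round(overcommit_ratio(host_gib), 2))
    # Illustrative fractions only; tune them for your own workload.
    print("planned split (GiB):", planned_split(host_gib, 0.40, 0.25))
```

Under these assumptions the two defaults alone add up to more than the physical memory of the node, before PD and the operating system are counted, which is why the limits have to be set explicitly in a mixed deployment.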
Based on my years of experience, this is most likely caused by a table that is missing an index, or by an index that is not being used, which leads to frequent full table scans and table-level hotspots. You should check the slow SQL in this case.
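When going through the execution plans of the slow queries, one quick check is whether a full-scan operator appears. The sketch below scans the text of an `EXPLAIN` result for operator IDs starting with `TableFullScan` or `IndexFullScan`. The helper name `find_full_scans`, the sample plan, and the table `orders` are all made up for illustration; paste in your own plan text instead.

```python
from typing import List

# Operator name prefixes treated as "scans the whole table or the whole index".
FULL_SCAN_PREFIXES = ("TableFullScan", "IndexFullScan")


def find_full_scans(plan_text: str) -> List[str]:
    """Return the operator IDs in an EXPLAIN text dump that look like full scans."""
    hits = []
    for line in plan_text.splitlines():
        # Drop indentation and the tree-drawing characters in front of the operator ID.
        stripped = line.lstrip(" │├└─")
        if not stripped:
            continue
        operator_id = stripped.split()[0]
        if operator_id.startswith(FULL_SCAN_PREFIXES):
            hits.append(operator_id)
    return hits


# Hypothetical plan text, for illustration only.
SAMPLE_PLAN = """\
TableReader_7        10.00     root       data:Selection_6
└─Selection_6        10.00     cop[tikv]  eq(test.orders.status, 1)
  └─TableFullScan_5  10000.00  cop[tikv]  table:orders  keep order:false, stats:pseudo
"""

if __name__ == "__main__":
    print(find_full_scans(SAMPLE_PLAN))
```

A hit is not automatically a problem, since a full scan on a small table is fine, but on a frequently executed query against a large table it is the first place to look for a missing index.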