Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TiKV内存过高告警,CDC内存占用过高排查 (TiKV high memory alert: investigating excessive CDC memory usage)
【TiDB Usage Environment】Production Environment
【TiDB Version】5.0.5
【Encountered Problem】High memory usage traced to excessive CDC memory consumption
- Memory Distribution
- CDC component not enabled in the cluster
- Historical CDC memory usage was 4G; it increased to 17.2G this morning
- No slow queries with excessive memory usage during the memory fluctuation (maximum memory 256 KB)
【Problem Phenomenon and Impact】
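Following the steps in the column article "TiKV主要内存结构和OOM排查总结" (Summary of TiKV's Main Memory Structures and OOM Troubleshooting) on the TiDB community site (TiDB 社区), I found that CDC memory usage was excessive.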
Questions:
- How to resolve excessive CDC memory usage?
- Why does CDC memory usage suddenly spike?
Restarting should solve the problem.
What caused it, boss? And why does restarting make it recover?
It might be related to the characteristics of Go. I’m not sure about the specifics. When I encounter TiKV using too much memory, I just restart it. Since it is distributed, it won’t cause failures.
Got it, thank you, master!
This is the memory usage of TiKV. Go check the server.
Background:
The original server memory usage was as expected.
This morning, there was a memory usage fluctuation. Upon investigation, it was found that at this time, CDC was using too much memory.
From the context, it can be seen that both CDC and TiKV memory consumption have increased. However, it is currently unclear whether the changes in TiKV data volume are causing certain actions in TiCDC, or if certain actions in TiCDC are leading to increased memory consumption in TiKV.
Questions:
- How to solve the issue of excessive CDC memory usage?
- Why is there a sudden increase in CDC memory usage?
To answer these two questions, you need to check which operations were running at the time, using ticdc.log and the dashboard. A restart will not stop those operations; it only releases memory temporarily because the process is interrupted. In short, the information provided so far is insufficient to reach a conclusion. Ideally, collect the diagnostic data with Clinic; if that is not convenient, you can export the dashboard and logs.
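As a rough aid for the log check suggested above, here is a minimal Python sketch that counts WARN and ERROR lines per minute, so the minutes around the memory spike stand out. It assumes the log lines follow the TiDB unified log format (`[2022/07/01 09:15:02.123 +08:00] [WARN] ...`). The function names and the default levels are illustrative choices and are not part of any TiDB tool.

```python
import re
from collections import Counter

# Assumed line shape (TiDB unified log format):
# [2022/07/01 09:15:02.123 +08:00] [WARN] [file.rs:10] ["message"] [key=value]
# Group 1 captures the timestamp truncated to the minute, group 2 the level.
LINE_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}\.\d{3} [+-]\d{2}:\d{2}\] \[(\w+)\]"
)


def count_by_minute(lines, levels=("WARN", "ERROR")):
    """Count log lines per (minute, level), keeping only the given levels."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(2) in levels:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def busiest_minutes(lines, top=5, levels=("WARN", "ERROR")):
    """Return the `top` (minute, level) buckets with the most log lines."""
    return count_by_minute(lines, levels).most_common(top)
```

To use it, pass an open log file (for example the TiKV log from the affected node, at whatever path it has on your server) to `busiest_minutes`, then read the raw log around the minutes it reports.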
Master, TiCDC is currently not deployed in this TiDB cluster, which is why this feels so strange.
I don't understand. Then why is the topic titled "TiKV high memory alert: investigating excessive CDC memory usage"?
As I understand it, the goal now is to find out why TiKV memory usage is too high and whether it is related to CDC. Is that right?