The ticdc_memory_abnormal alert keeps firing

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 一直有这个告警,ticdc_memory_abnormal

| username: 路在何chu

【TiDB Usage Environment】Production Environment
4013
【Reproduction Path】What operations were performed when the issue occurred

【Encountered Issue: Issue Phenomenon and Impact】
There is still plenty of memory available. I looked up this alert: it fires when TiCDC heap memory usage exceeds 10 GiB. Does it mean anything in practice? My CDC server has 62G of memory, of which 32G is in use. Can I raise the alert threshold, and what are the risks?
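To put the numbers in this post side by side, here is a minimal Python sketch that expresses an alert threshold as a fraction of host memory. The function name and the 50% figure are illustrative assumptions, not a recommendation from TiCDC. Note also that the 32G figure is host memory in use, which is not necessarily the same as the TiCDC heap size the alert measures.

```python
GIB = 1024 ** 3  # bytes in one GiB


def heap_alert_threshold_bytes(total_mem_gib: float, fraction: float) -> int:
    """Return an alert threshold in bytes as a fraction of host memory.

    Hypothetical helper for illustration only.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(total_mem_gib * fraction * GIB)


# Figures from the post: a 62G host and a 10 GiB alert threshold.
default_share = 10 / 62  # the threshold is roughly 16% of host memory
print(f"10 GiB is {default_share:.0%} of a 62 GiB host")

# Assumed example: alert when the heap reaches half of host memory.
print(heap_alert_threshold_bytes(62, 0.5))
```

A fraction-based threshold keeps the alert meaningful if the server is resized, but any value should leave room for other processes on the host.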

| username: 路在何chu

To add: I didn’t see anything abnormal in the logs either.

| username: 小龙虾爱大龙虾

The alert doesn’t mean much here; you can raise the threshold :grinning:

| username: 路在何chu

What value have you all set it to? As long as the server has enough memory, is it fine to raise the alert threshold?

| username: Fly-bird

The alert is meaningless as long as the actual memory usage does not exceed the limit.

| username: 像风一样的男子

Edit the alert rules in the Prometheus configuration directory, tidb-deploy/prometheus-8249/conf.
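A minimal Python sketch of that kind of edit, assuming the rule compares a heap metric against a fixed number of bytes. The expression string below is an assumption about what the shipped rule looks like, not something quoted in this thread, and `set_threshold` is a hypothetical helper; check the actual rules file in that directory before changing anything.

```python
import re

# Assumed shape of the rule expression; verify against your own rules file.
ASSUMED_EXPR = 'go_memstats_heap_alloc_bytes{job="ticdc"} > 1e+10'


def set_threshold(expr: str, new_bytes: int) -> str:
    """Replace the trailing numeric threshold of a `metric > number` expression."""
    new_expr, count = re.subn(r'>\s*[0-9.eE+]+\s*$', f'> {new_bytes}', expr)
    if count != 1:
        raise ValueError(f"no trailing '> <number>' comparison in: {expr!r}")
    return new_expr


# Example: raise the threshold to 30 GiB (an arbitrary illustrative value).
print(set_threshold(ASSUMED_EXPR, 30 * 1024 ** 3))
```

Prometheus has to reload its rules before the new value takes effect, and a deployment tool may regenerate files in that directory later, so it is worth confirming that the change persists.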

| username: 小龙虾爱大龙虾

Yes, as long as it doesn’t OOM, it’s fine.

| username: andone

Please post the logs for review.

| username: swino

Here are some possible causes for the “ticdc_memory_abnormal” alert:

  1. Too little memory allocated to TiCDC at runtime, so it cannot keep up with converting and transmitting large volumes of data changes.
  2. Too many database changes being captured by TiCDC for its processing capacity.
  3. Memory leaks or other memory management issues within TiCDC.

Here are some potential solutions:

  1. Increase the memory quota available to TiCDC at runtime so it can handle more data changes.
  2. Narrow the scope and reduce the frequency of data capture by TiCDC to lower its memory load.
  3. Check the TiCDC logs for memory issues and bottlenecks, then adjust the TiCDC configuration or fix the underlying problem.
  4. Update TiCDC to the latest version to take advantage of the latest optimizations and updates.

The “ticdc_memory_abnormal” alert is a prompt to look at TiCDC’s memory usage. Checking the TiCDC logs and configuration and adjusting runtime parameters where needed helps keep TiCDC stable and accurate.

| username: 大飞哥online

Change the alert rules in the configuration directory tidb-deploy/prometheus-8249/conf.