Abnormal TiCDC Monitoring Metrics

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TICDC 监控指标异常

| username: TiDBer_yyy

Database version: 5.0.4

Many panels on the TiCDC monitoring dashboard show "No data".

The symptom is very similar to this post: TiCDC监控看板没有监控数据 (TiCDC monitoring dashboard has no monitoring data) - TiDB 的问答社区. I followed the steps in that post, but the dashboard did not recover.

| username: jansu-dev | Original post link

Did you create a changefeed? If not, many of the changefeed panels will be empty, because no replication task exists to produce those metrics.

| username: TiDBer_yyy | Original post link

Yes, they were created earlier. There are currently 5 or 6 changefeeds.

| username: jansu-dev | Original post link

Troubleshooting steps:

  1. These are metrics, so first check whether they exist in Prometheus.
  2. If they do not, check the Prometheus logs for clues, such as the TiCDC port not being exposed or the TiCDC scrape target not being included in the Prometheus configuration at all.
  3. If they do, check whether the query expressions in the Grafana panels are correct.

Follow these steps one by one, and you will find the answer.
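
As a rough illustration of step 1, the sketch below asks Prometheus whether it holds any series whose name starts with `ticdc_`, using the standard instant-query endpoint `/api/v1/query`. The Prometheus address and the `ticdc_` metric prefix are assumptions, not values taken from this thread, so adjust them to your deployment.

```python
import json
import urllib.parse
import urllib.request

# Assumption: replace with the address of your own Prometheus instance.
PROMETHEUS_URL = "http://127.0.0.1:9090"

# Assumption: TiCDC metric names start with "ticdc_".
TICDC_SERIES_QUERY = 'count({__name__=~"ticdc_.+"})'


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def has_series(response):
    """Return True if a parsed /api/v1/query response holds at least one series."""
    if response.get("status") != "success":
        return False
    return len(response.get("data", {}).get("result", [])) > 0


def check_ticdc_metrics(base_url=PROMETHEUS_URL):
    """Query Prometheus and report whether any TiCDC series exist."""
    url = build_query_url(base_url, TICDC_SERIES_QUERY)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return has_series(json.load(resp))
```

If `check_ticdc_metrics()` returns `False`, the data never reached Prometheus, so the problem is on the scrape side (step 2) rather than in the Grafana expressions (step 3).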

| username: TiDBer_yyy | Original post link

  1. Querying Prometheus directly with PromQL returns no data.
  2. The Prometheus logs look normal.
  3. The Grafana logs show an error. After a restart, the log reports "could not find datasource: data source not found", similar to the issue described in grafana 很多alert 都在告警,Execution Error: Could not find datasource Data source not found 但是没达到报警阈值 - TiDB 的问答社区.

| username: jansu-dev | Original post link

Does a reload restore it, as described in that post?
It looks like Prometheus is not scraping data from TiCDC and persisting it.
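
One way to confirm whether Prometheus is scraping TiCDC is to list its active targets through the standard `/api/v1/targets` endpoint and filter by job. This is only a sketch: the job name `ticdc` and the Prometheus address in the usage note are assumptions about the deployment, not values confirmed in this thread.

```python
import json
import urllib.request


def fetch_active_targets(base_url):
    """Fetch the list of active scrape targets from the Prometheus HTTP API."""
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["data"]["activeTargets"]


def summarize_job(active_targets, job):
    """Return (scrapeUrl, health, lastError) for every target in the given job."""
    return [
        (t.get("scrapeUrl"), t.get("health"), t.get("lastError", ""))
        for t in active_targets
        if t.get("labels", {}).get("job") == job
    ]
```

For example, `summarize_job(fetch_active_targets("http://127.0.0.1:9090"), "ticdc")` returning an empty list would suggest that no TiCDC target is configured at all, while entries with health `down` would point to a reachability problem, with `lastError` giving the reason.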

| username: TiDBer_yyy | Original post link

Reloading Prometheus and Grafana again restored the monitoring data.
