Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tidb 持续报警 tidb_tikvclient_backoff_seconds_count (TiDB keeps alerting on tidb_tikvclient_backoff_seconds_count)
[Test Environment for TiDB]
[TiDB Version] V6.1.2
[Reproduction Path] There is an alert for tidb_tikvclient_backoff_seconds_count almost every day.
[Encountered Issue: Phenomenon and Impact] Currently, there is no impact on the business.
[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]
I usually ignore warnings of this level and don't act on them; as I recall, there is also an emergency level. Also, this alert says region miss, which should indicate a system failure. The region is constantly being scheduled.
Backoff is a retry mechanism. In TiDB, when the tikv-client sends some requests to the tikv-server and fails, it will perform a backoff retry.
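To make the retry idea concrete, here is a minimal Python sketch of a capped exponential backoff with jitter. It is not TiDB's actual tikv-client code: `RegionMissError`, `call_with_backoff`, and the parameters `base_ms`, `cap_ms`, and `max_total_ms` are hypothetical names and values chosen only for illustration.

```python
import random
import time


class RegionMissError(Exception):
    """Hypothetical error standing in for a failed request to the storage node."""


def call_with_backoff(request, base_ms=2, cap_ms=500, max_total_ms=5000,
                      sleep=time.sleep):
    """Retry `request` until it succeeds or the total sleep budget is used up.

    Returns a tuple (result, number_of_backoffs).
    All default values are illustrative assumptions, not TiDB settings.
    """
    attempt = 0
    slept_ms = 0.0
    while True:
        try:
            return request(), attempt
        except RegionMissError:
            # The wait ceiling doubles with each failed attempt, up to cap_ms.
            ceiling = min(cap_ms, base_ms * (2 ** attempt))
            # Jitter: pick a random delay below the ceiling.
            delay_ms = random.uniform(0, ceiling)
            if slept_ms + delay_ms > max_total_ms:
                # Budget exhausted: give up and surface the last error.
                raise
            sleep(delay_ms / 1000.0)
            slept_ms += delay_ms
            attempt += 1
```

Each pass through the `except` branch is one backoff, which is roughly the kind of event a backoff counter metric would record. That is why such a counter can keep growing even while requests eventually succeed.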
Take a look at the PD and TiKV logs to see what caused the issue.
TiDB Cluster Alert Rules | PingCAP Docs
I checked, and it seems that TiDB failed to access TiKV. There should be some data loss. The following alert says region miss.
My resource usage is very low, so this kind of issue is very frustrating.
Last night you said I had your back, now you have to handle it yourself. I think there might have been a node restart or something that caused a region miss. Let’s wait for the expert to take a look. How do we remove this region?
You can check the monitoring to see what values this metric normally shows. The metric will always have some value, so adjust the alert rule and set a threshold that suits your system.
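One possible way to pick such a threshold, sketched in Python: take historical samples of how much the counter grew per alert window, find a high percentile, and add some headroom. The function name `suggest_threshold`, the percentile, and the headroom factor are all assumptions for illustration; the samples would have to come from your own monitoring.

```python
import math


def suggest_threshold(increases, percentile=0.99, headroom=1.5):
    """Suggest an alert threshold from historical per-window increases.

    `increases` is a list of observed counter increases, one per window.
    `percentile` and `headroom` are illustrative defaults, not recommendations.
    """
    if not increases:
        raise ValueError("need at least one sample")
    ordered = sorted(increases)
    # Nearest-rank percentile: the smallest value covering `percentile` of samples.
    rank = max(1, math.ceil(percentile * len(ordered)))
    baseline = ordered[rank - 1]
    # Add headroom so that normal fluctuation does not trigger the alert.
    return math.ceil(baseline * headroom)
```

For example, you could feed it the per-window increases of tidb_tikvclient_backoff_seconds_count over a week in which the business was unaffected, then use the result as a starting point and tune it from there.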