Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TiKV log reports "pd worker send latency inspecter failed"
Today, I found that one of the three TiKV nodes was down. The logs kept showing “pd worker send latency inspector failed.” I then tried to scale out a new node, but the same exception occurred. The other two nodes are fine.
The logs of this node are constantly showing this line. I’m monitoring it, but I’m not sure which metric to look at.
Now the same message is showing up again.
Please share the complete monitoring data.
Check the network status between PD and TiKV.
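A minimal sketch of such a check, written in Python so it can be run from either node. The port numbers are the usual defaults (20160 and 20180 for TiKV, 2379 for PD) and are an assumption here, as are the example IP addresses; substitute the values from your own topology.

```python
import socket

# Default ports, assumed here; replace with the values from your topology file.
TIKV_PORTS = (20160, 20180)   # TiKV service port and status port
PD_CLIENT_PORT = 2379         # PD client port


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to its reachability."""
    return {(host, port): port_open(host, port, timeout) for host, port in endpoints}


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    tikv_host, pd_host = "10.0.1.1", "10.0.1.2"
    targets = [(tikv_host, p) for p in TIKV_PORTS] + [(pd_host, PD_CLIENT_PORT)]
    for (host, port), ok in check_endpoints(targets).items():
        print(f"{host}:{port} {'reachable' if ok else 'UNREACHABLE'}")
```

Like telnet, this only shows that the port accepts TCP connections. It does not prove that the requests between PD and TiKV are being served in time.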
This is related to GC. I saw an exception in the GC logs!
I still need to study the monitoring metrics carefully…
How to fix the disconnection between PD and TiKV?
Disconnected? Network down? If the network is up, it will reconnect automatically.
Telnet succeeds, which rules out a basic network problem. There are two clusters, an old one and a new one. The old one has no issues, but the new one, which was scaled out with two new TiKV nodes, has connectivity problems: the store first shows as Disconnected and then as Down. I’m not sure whether the old cluster is affecting it.
Did you telnet to both TiKV ports?
Yes. While PD shows the store as Down, telnet to both TiKV ports works, and telnet from the TiKV node to PD works as well.
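To watch the Disconnected and Down states from PD's side, one option is to read the store list from PD's HTTP API. The sketch below assumes the `/pd/api/v1/stores` endpoint and a response in which each entry has a `store` object with `id`, `address`, and `state_name` fields. Verify this against your PD version. The PD address in the example is hypothetical.

```python
import json
import urllib.request


def fetch_stores(pd_addr, timeout=5.0):
    """Fetch store information from PD's HTTP API (GET /pd/api/v1/stores)."""
    url = f"http://{pd_addr}/pd/api/v1/stores"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_stores(payload):
    """Return (id, address, state_name) for every store whose state is not Up."""
    result = []
    for item in payload.get("stores", []):
        store = item.get("store", {})
        # Assumed field name; a missing field is reported as "Unknown".
        state = store.get("state_name", "Unknown")
        if state != "Up":
            result.append((store.get("id"), store.get("address"), state))
    return result


if __name__ == "__main__":
    # Hypothetical PD address, for illustration only.
    for store_id, address, state in unhealthy_stores(fetch_stores("10.0.1.2:2379")):
        print(f"store {store_id} at {address}: {state}")
```

Running this periodically during the incident would show whether the store moves from Disconnected to Down or recovers on its own.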
The TiKV logs also continuously indicate: pd worker send latency inspector failed
Hello, could you please check if there are any anomalies in the PD leader’s logs?
Thank you for your attention. Restarting TiKV solved the issue, but later there was data inconsistency, which was resolved by restarting TiDB.