TiKV log indicates PD worker send latency inspector failed

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv日志提示pd worker send latency inspecter failed

| username: Hacker007

Today I found that one of the three TiKV nodes was down, and its log kept showing “pd worker send latency inspector failed.” I then scaled out with a new TiKV node, but it hit the same error. The other two nodes are fine.

| username: songxuecheng | Original post link

  1. Check whether PD has any I/O problems.
  2. Post the complete TiKV log.
  3. Post the TiKV monitoring data.

| username: Hacker007 | Original post link

This node’s log keeps printing this line. Monitoring is in place, but I’m not sure which metric to look at.

| username: Hacker007 | Original post link

Now it is reporting this again.

| username: songxuecheng | Original post link

Please post the complete monitoring data.

| username: h5n1 | Original post link

Check the network status between PD and TiKV.

| username: Hacker007 | Original post link

This is related to GC. I saw an exception in the GC logs!

| username: Hacker007 | Original post link

I still need to study the monitoring metrics carefully…

| username: Hacker007 | Original post link

How can I fix the disconnection between PD and TiKV?

| username: h5n1 | Original post link

Disconnected? Network down? If the network is up, it will reconnect automatically.

| username: Hacker007 | Original post link

telnet connects fine, which rules out network issues. There are two clusters, an old one and a new one. The old one has no issues, but the new one, which was scaled out with two new TiKV nodes, has connectivity problems: a store first shows as Disconnected and then as Down. I’m not sure whether the old cluster is affecting it.
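
The Disconnected and Down labels are store states reported by PD. As a minimal sketch for watching them, the snippet below reads PD's HTTP API (`/pd/api/v1/stores`) and lists every store whose `state_name` is not `Up`. The PD address in the usage comment is a placeholder, and the JSON layout is assumed to match what `pd-ctl store` prints; check both against your deployment and version.

```python
import json
from urllib.request import urlopen


def fetch_stores(pd_addr, timeout=5.0):
    """GET the store list from PD's HTTP API.

    pd_addr is a placeholder such as "http://127.0.0.1:2379".
    """
    with urlopen(f"{pd_addr}/pd/api/v1/stores", timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_stores(payload):
    """Return (id, address, state_name) for every store that is not 'Up'.

    Assumes each entry looks like {"store": {"id": ..., "address": ...,
    "state_name": ...}}; missing fields are tolerated.
    """
    result = []
    for item in payload.get("stores", []):
        store = item.get("store", {})
        state = store.get("state_name", "Unknown")
        if state != "Up":
            result.append((store.get("id"), store.get("address"), state))
    return result


# Usage (placeholder address, uncomment to run against a real PD):
# for store_id, addr, state in unhealthy_stores(fetch_stores("http://127.0.0.1:2379")):
#     print(store_id, addr, state)
```

Running this periodically shows whether a new store moves from Disconnected to Down or recovers on its own.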

| username: h5n1 | Original post link

Did you telnet to both TiKV ports?

| username: Hacker007 | Original post link

Yes. While PD shows the store as Down, telnet to both TiKV ports works, and telnet from the TiKV host to PD works as well.
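
The telnet checks described here can be scripted so they are easy to repeat from each host. This is only a sketch: the host addresses are placeholders, and the ports in the usage comment (20160 and 20180 for TiKV, 2379 for PD) are the usual defaults, which may differ in your topology. A successful TCP connect only shows that the port is reachable, not that the gRPC service behind it is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map 'host:port' to reachability for each (host, port) pair."""
    return {f"{host}:{port}": can_connect(host, port, timeout)
            for host, port in endpoints}


# Usage (placeholder hosts, default ports assumed; uncomment to run):
# print(check_endpoints([
#     ("10.0.0.2", 20160),  # TiKV service port
#     ("10.0.0.2", 20180),  # TiKV status port
#     ("10.0.0.9", 2379),   # PD client port
# ]))
```

Run it once from the PD host against the TiKV ports and once from the TiKV host against PD, since the two directions can behave differently.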

| username: Hacker007 | Original post link

The TiKV log also keeps printing: pd worker send latency inspector failed
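
To see when the message started and how often it repeats, a small script can count it per minute. This is a sketch under two assumptions: each log line begins with a timestamp like `[2022/03/01 10:00:01.123 +08:00]`, and the message may be spelled either "inspector" or "inspecter" (the original topic title uses the latter). The log path in the usage comment is a placeholder.

```python
import re
from collections import Counter

# Tolerates both spellings of the message seen in this thread.
MESSAGE = re.compile(r"pd worker send latency inspect[eo]r failed")
# Captures the timestamp down to the minute; the format is an assumption.
TIMESTAMP = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}")


def count_per_minute(lines):
    """Count matching log lines, keyed by 'YYYY/MM/DD HH:MM'.

    Lines that match the message but carry no recognizable timestamp
    are counted under the key 'unknown'.
    """
    counts = Counter()
    for line in lines:
        if not MESSAGE.search(line):
            continue
        match = TIMESTAMP.match(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Usage (placeholder path, uncomment to run):
# with open("tikv.log", encoding="utf-8", errors="replace") as f:
#     for minute, n in sorted(count_per_minute(f).items()):
#         print(minute, n)
```

Comparing the first minute with a nonzero count against the PD leader's log for the same period can help narrow down what changed.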

| username: Min_Chen | Original post link

Hello, could you please check if there are any anomalies in the PD leader’s logs?

| username: Hacker007 | Original post link

Thank you for your attention. Restarting TiKV solved the issue, but later there was data inconsistency, which was resolved by restarting TiDB.
