PD and TiDB Communication Continuously Reporting Errors

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: PD与Tidb通讯持续报错

| username: TiDBer_pcHqOtU9

[TiDB Usage Environment] Production Environment
[TiDB Version] v5.4.3
[Reproduction Path] Default Installation
[Encountered Problem: Phenomenon and Impact] While checking the TiDB logs, I found that the error shown in the screenshot keeps recurring. The application also occasionally hits transaction lock timeouts when it operates on the database, which has a significant impact on the business. I am not sure whether this error is the cause. Among the 6 TiDB and 6 PD instances, why do only 3.41 and 3.44 keep reporting errors? Where should I start investigating?

[Attachments: Screenshots/Logs/Monitoring]

| username: xfworld

Have there been any changes to the network structure recently?

| username: TiDBer_pcHqOtU9

The original cluster was poorly laid out, so we recently carried out a large-scale migration: we scaled out onto new machines and then scaled in all of the old nodes. The IP addresses have indeed changed.

| username: xfworld

That procedure doesn't sound quite right. Check all the nodes in Prometheus to confirm that the cluster looks the way you expect. If it doesn't, you'll have some extra work to do to sort it out.
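
One possible way to do that check in bulk is to ask Prometheus for its scrape targets and compare them with the nodes you expect after the migration. The sketch below is only an illustration: it relies on Prometheus' standard `/api/v1/targets` HTTP endpoint, while the function names and the list of expected `host:port` instances are assumptions that you would replace with your own topology.

```python
import json
from urllib.request import urlopen


def fetch_active_targets(prometheus_url):
    """Return the activeTargets list from Prometheus' /api/v1/targets endpoint."""
    with urlopen(f"{prometheus_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)["data"]["activeTargets"]


def audit_targets(active_targets, expected_instances):
    """Compare scrape targets with the expected instances (hypothetical helper).

    Returns three sorted lists:
      unexpected - scraped instances that are not in the expected list
                   (for example, old nodes that were scaled in)
      unhealthy  - expected instances whose scrape health is not "up"
      missing    - expected instances that Prometheus does not scrape at all
    """
    expected = set(expected_instances)
    seen = set()
    unexpected = set()
    unhealthy = set()
    for target in active_targets:
        instance = target["labels"]["instance"]
        seen.add(instance)
        if instance not in expected:
            unexpected.add(instance)
        elif target["health"] != "up":
            unhealthy.add(instance)
    return sorted(unexpected), sorted(unhealthy), sorted(expected - seen)
```

To use it, pass the result of `fetch_active_targets("http://<prometheus-host>:9090")` to `audit_targets` together with the instances from your current topology. Any entry in `unexpected` would suggest that an old address is still registered somewhere.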

| username: redgame

I recommend checking the network manually.
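
A manual check could start with plain TCP reachability from the affected TiDB hosts to each PD endpoint. The following is a minimal sketch, assuming the default PD client port 2379. The helper names are made up for illustration, and the host list is something you would fill in from your own cluster.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_endpoints(endpoints, timeout=2.0):
    """Map each (host, port) pair to whether it accepted a TCP connection."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host, port in endpoints
    }
```

Running `check_endpoints([("<pd-host>", 2379), ...])` on the two TiDB nodes that report errors, and again on a node that does not, may show whether the problem is specific to those hosts. A successful TCP connection only shows that the port is reachable. It does not prove that the PD service behind it is healthy.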
