Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 升级TIDB集群后tidb一直报错 write: connection reset by peer (After upgrading the TiDB cluster, TiDB keeps reporting "write: connection reset by peer")

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] v5.4.3 (upgraded from v4.0.9)
[Reproduction Path] What operations were performed to encounter the issue
[Encountered Issue: Problem Phenomenon and Impact]
I upgraded the cluster from v4.0.9 to v5.4.3. Since the upgrade, the TiDB logs have been reporting a large number of errors like the following:
[stack="github.com/pingcap/tidb/parser/terror.Log\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/terror/terror.go:307\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:516"]
[2023/04/21 11:06:12.400 +08:00] [ERROR] [terror.go:307] ["encountered error"] [error="write tcp 192.168.241.72:4000->192.168.241.55:21118: write: connection reset by peer"] [stack="github.com/pingcap/tidb/parser/terror.Log\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/terror/terror.go:307\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:516"]
[2023/04/21 11:06:12.444 +08:00] [ERROR] [terror.go:307] ["encountered error"] [error="write tcp 192.168.241.72:4000->192.168.241.55:21123: write: connection reset by peer"] [stack="github.com/pingcap/tidb/parser/terror.Log\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/terror/terror.go:307\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:516"]
[2023/04/21 11:06:12.507 +08:00] [ERROR] [terror.go:307] ["encountered error"] [error="write tcp 192.168.241.72:4000->192.168.241.54:40415: write: connection reset by peer"] [stack="github.com/pingcap/tidb/parser/terror.Log\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/terror/terror.go:307\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:516"]
[2023/04/21 11:06:12.519 +08:00] [ERROR] [terror.go:307] ["encountered error"] [error="write tcp 192.168.241.72:4000->192.168.241.54:40416: write: connection reset by peer"] [stack="github.com/pingcap/tidb/parser/terror.Log\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/terror/terror.go:307\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:516"]
I searched AskTUG and found many similar reports. Most posters worked around the problem by disabling the load balancer's health check, but none of the threads reached a definitive fix. Some experts on AskTUG suggested pointing HAProxy's health check at a different port, but they did not say how to configure that, and the official documentation does not cover the corresponding HAProxy configuration either.
I did not see this issue on v4.0.9; it appeared only after upgrading to v5.4.3. Is this a bug? If so, has it been fixed in v6.5.1, which is my target version?
Below is my HAProxy configuration file, which follows the official example configuration:
# cat /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    user haproxy
    group haproxy
    nbproc 10
    daemon
    stats socket /var/lib/haproxy/stats
defaults
    log global
    retries 2
    timeout connect 2s
    timeout client 30000s
    timeout server 30000s
listen admin_stats
    bind 192.168.241.54:18080
    mode http
    option httplog
    maxconn 10
    stats refresh 30s
    stats uri /haproxy
    stats realm HAProxy
    stats auth admin:UXnxFu5Mxxxxxxxxxxxx
    stats hide-version
    stats admin if TRUE
listen tidb-xxxxx
    bind 0.0.0.0:14000
    mode tcp
    balance leastconn
    server tidb-71 192.168.241.71:4000 send-proxy check inter 2000 rise 2 fall 3
    server tidb-72 192.168.241.72:4000 send-proxy check inter 2000 rise 2 fall 3
    server tidb-73 192.168.241.73:4000 send-proxy check inter 2000 rise 2 fall 3
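
For reference, my understanding of the "change the health check port" suggestion is roughly the sketch below. It is untested and rests on two assumptions that are not in my current setup: that each TiDB server exposes its HTTP status service on the default port 10080, and that this port is reachable from the HAProxy hosts. The idea is to probe the status endpoint over HTTP instead of opening and immediately dropping a TCP connection on the MySQL port 4000, which is what a plain `check` does.

```haproxy
listen tidb-xxxxx
    bind 0.0.0.0:14000
    mode tcp
    balance leastconn
    # Health checks become HTTP requests to the TiDB status endpoint.
    # Client traffic itself is still proxied as plain TCP.
    option httpchk GET /status
    # "port 10080" is an assumption (TiDB's default status port); adjust it
    # if status.status-port is set differently in the TiDB configuration.
    server tidb-71 192.168.241.71:4000 send-proxy check port 10080 inter 2000 rise 2 fall 3
    server tidb-72 192.168.241.72:4000 send-proxy check port 10080 inter 2000 rise 2 fall 3
    server tidb-73 192.168.241.73:4000 send-proxy check port 10080 inter 2000 rise 2 fall 3
```

As far as I can tell from the HAProxy documentation, once `port` is set on a server line the checks no longer carry the PROXY protocol header by default, so `send-proxy` should affect only real client connections here. Whether this actually makes the "connection reset by peer" errors go away is exactly what I would like to have confirmed.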