Experts, my HAProxy + Keepalived proxy for TiDB might be experiencing network instability. The VIP switches between primary and backup quickly. Some nodes cache the ARP info of the backup, causing connection issues to TiDB. What could be the reason?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 大佬们,我的haproxy+keepalived代理的tidb,可能是网络不稳定 vip从主切到备,然后短时间又切回主上。 集群中有节点缓存了备的arp信息 导致连不上tidb,这个是什么原因?

| username: jaybing926

Experts, my HAProxy + Keepalived setup in front of TiDB may be suffering from network instability. The VIP fails over from the primary to the backup and then, a short time later, switches back to the primary. Afterwards, some nodes in the cluster still have the backup's ARP entry cached for the VIP, so they cannot connect to TiDB. What could be causing this?

PS: In our cluster's internal network, all servers are connected to a single switch or to cascaded switches. No gateway is configured.

The nodes can capture the VRRP advertisement packets.
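
One way to confirm which nodes still hold the backup's MAC for the VIP is to compare each node's neighbor table with the current master's MAC. The sketch below is only an illustration: it parses text in the format printed by `ip neigh show`, and the VIP, the MAC addresses, and the function name are made-up placeholders, not values from this cluster.

```python
from typing import List


def find_stale_vip_entries(neigh_output: str, vip: str, master_mac: str) -> List[str]:
    """Return cached MACs for `vip` that differ from the current master's MAC.

    `neigh_output` is expected to look like the output of `ip neigh show`,
    e.g. "192.168.10.100 dev eth1 lladdr 52:54:00:aa:bb:02 STALE".
    """
    stale = []
    for line in neigh_output.splitlines():
        fields = line.split()
        # Skip blank lines, other IPs, and entries without a resolved MAC (e.g. FAILED).
        if not fields or fields[0] != vip or "lladdr" not in fields:
            continue
        mac = fields[fields.index("lladdr") + 1].lower()
        if mac != master_mac.lower():
            stale.append(mac)
    return stale


# Hypothetical sample: the VIP still maps to the backup's MAC (...:02).
SAMPLE_NEIGH = """\
192.168.10.100 dev eth1 lladdr 52:54:00:aa:bb:02 STALE
192.168.10.11 dev eth1 lladdr 52:54:00:aa:bb:11 REACHABLE
192.168.10.12 dev eth1 FAILED
"""

stale_macs = find_stale_vip_entries(SAMPLE_NEIGH, "192.168.10.100", "52:54:00:aa:bb:01")
print(stale_macs)
```

On a real node the input would come from running `ip neigh show` on the affected client; a non-empty result means that node is still sending VIP traffic to a MAC other than the master's.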

| username: TiDBer_pkQ5q1l0

Just arping the gateway; that is how it is done on large Layer 2 networks.
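
For context, the following is a rough sketch of what an arping-style announcement (a gratuitous ARP) puts on the wire. It is sent to the Ethernet broadcast address, so every host on the same Layer 2 segment can see it. The VIP, the MAC address, and the function names here are illustrative assumptions; `send_frame` relies on Linux `AF_PACKET` raw sockets, needs root, and is not called in the example.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(vip: str, mac: str) -> bytes:
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP request.

    In a gratuitous ARP the sender IP and the target IP are both the
    announced address (here the VIP), and the frame is broadcast.
    """
    sender_mac = mac_to_bytes(mac)
    vip_bytes = socket.inet_aton(vip)
    eth_header = BROADCAST_MAC + sender_mac + struct.pack("!H", ETHERTYPE_ARP)
    arp_fixed = struct.pack(
        "!HHBBH",
        1,       # hardware type: Ethernet
        0x0800,  # protocol type: IPv4
        6,       # hardware address length
        4,       # protocol address length
        1,       # opcode: request
    )
    arp_body = sender_mac + vip_bytes + b"\x00" * 6 + vip_bytes
    return eth_header + arp_fixed + arp_body


def send_frame(interface: str, frame: bytes) -> None:
    """Send a raw frame on `interface` (Linux only, requires root)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((interface, 0))
        sock.send(frame)


# Hypothetical VIP and master MAC, for illustration only.
frame = build_gratuitous_arp("192.168.10.100", "52:54:00:aa:bb:01")
print(len(frame), frame.hex())
```

Hosts that receive such a frame may update their cached entry for the VIP to the announced MAC, which is the effect the arping suggestion relies on.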

| username: jaybing926

Our internal network has no gateway configured; external traffic goes through a separate network card.
So how can we avoid this issue? We can't manually refresh the ARP entries every time the problem occurs, right?
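
One possible way to avoid the manual step, sketched here under several assumptions, is a small script that re-announces the VIP whenever a node becomes master. It assumes the iputils version of `arping` is installed (its `-U` flag sends unsolicited ARP to update neighbours' caches), and the interface name, VIP, and function names are placeholders. Keepalived normally sends gratuitous ARPs itself on a transition to master, so this would only be an extra safeguard if those announcements are being lost.

```python
import subprocess
import sys
from typing import List


def build_arping_command(vip: str, interface: str, count: int = 3) -> List[str]:
    """Build an iputils `arping` command that announces `vip` on `interface`."""
    return ["arping", "-U", "-I", interface, "-c", str(count), vip]


def announce_vip(vip: str, interface: str, count: int = 3) -> int:
    """Run arping and return its exit code (requires root and iputils arping)."""
    return subprocess.run(build_arping_command(vip, interface, count), check=False).returncode


# Hypothetical values: the internal interface and the VIP.
print(" ".join(build_arping_command("192.168.10.100", "eth1")))

# Invoked as: announce_vip.py <interface> <vip>
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(announce_vip(sys.argv[2], sys.argv[1]))
```

Such a script could be hooked into the `notify_master` setting of the Keepalived `vrrp_instance`, with the script path and arguments adapted to the environment. Keepalived also has its own gratuitous ARP settings, which are worth checking in its documentation before adding a script.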

| username: xingzhenxiang

Refer to

| username: jaybing926

I have read this document. The purpose of this operation is to send an ARP update to the gateway when the VIP switches, right?
But our environment doesn't have a gateway, so what should we do in this situation? :crazy_face: :crazy_face:

| username: TiDBer_pkQ5q1l0

If traffic needs to go out, there must definitely be a gateway configured somewhere.

| username: jaybing926

eth1 is the internal network interface and has no gateway; the only gateway is on the external network card, eth0.