Why did the TiKV leader count suddenly drop and then rise again?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv的leader突然下降,有上升,啥原因

| username: 路在何chu

[TiDB Usage Environment] Production Environment
[TiDB Version]
4.0.13
[Reproduction Path] Operations performed that led to the issue
The number of connections suddenly increased, and the number of leaders on one of the TiKV nodes suddenly dropped.
[Encountered Issue: Symptoms and Impact]



| username: 路在何chu | Original post link

The distribution of leaders is also uneven.

| username: tidb菜鸟一只 | Original post link

Check whether this node briefly lost its network connection at that time.

| username: 像风一样的男子 | Original post link

Did a TiKV node restart?

| username: 路在何chu | Original post link

The network was not disconnected.

| username: 路在何chu | Original post link

The latency was also very high.

| username: 路在何chu | Original post link

You were right, the node did restart.

| username: 像风一样的男子 | Original post link

It is recommended to set up monitoring and alerts for TiDB so that you can be promptly informed when a node restarts.
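One common way to catch a restart is to watch the process start time and alert when it changes within a time window. Prometheus client libraries typically export this as `process_start_time_seconds`. I am assuming TiKV exposes that metric in your deployment. The Python sketch below shows only the check itself, on samples that have already been fetched; the function names and sample values are hypothetical.

```python
from typing import Sequence


def count_restarts(start_times: Sequence[float]) -> int:
    """Count how often the reported start time changed between consecutive samples.

    Each change means the process was started again between two scrapes.
    """
    return sum(
        1 for prev, cur in zip(start_times, start_times[1:]) if cur != prev
    )


def should_alert(start_times: Sequence[float]) -> bool:
    """Alert if at least one restart happened inside the sampled window."""
    return count_restarts(start_times) > 0


# Hypothetical samples: the start time jumps once, so one restart is detected.
samples = [1700000000.0, 1700000000.0, 1700000420.0, 1700000420.0]
print(count_restarts(samples), should_alert(samples))
```

In a real setup the same idea would normally live in an alerting rule on the monitoring side rather than in a script.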

| username: 路在何chu | Original post link

The node restarted without writing anything to its log.

| username: 路在何chu | Original post link

However, there’s one thing I don’t understand. When TiKV restarts, why does the number of TiDB connections suddenly increase so much?

| username: 像风一样的男子 | Original post link

When a TiKV node restarts, the leaders of the Regions on it switch to other nodes. SQL statements that touch those Regions hit errors and are retried, so they take longer to finish and the number of connections is bound to increase.
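The effect of retries on the connection count can be estimated with Little's law: average requests in flight = arrival rate × average latency. The Python sketch below is only an illustration. The request rate, latency, retry count and backoff are made-up numbers, not values from this cluster, and it assumes every in-flight statement holds one connection.

```python
def latency_with_retries(base_latency_s: float, retries: int, backoff_s: float) -> float:
    """Total latency of a request that fails `retries` times before succeeding.

    Each attempt costs base_latency_s, and each retry waits backoff_s first.
    """
    return base_latency_s * (retries + 1) + backoff_s * retries


def in_flight_requests(arrival_rate_per_s: float, avg_latency_s: float) -> float:
    """Little's law: average number of requests in flight."""
    return arrival_rate_per_s * avg_latency_s


rate = 2000.0  # hypothetical statements per second
normal = in_flight_requests(rate, latency_with_retries(0.005, 0, 0.1))
during_switch = in_flight_requests(rate, latency_with_retries(0.005, 2, 0.1))
print(round(normal), round(during_switch))
```

With these assumed numbers the in-flight count goes from roughly 10 to roughly 430 while the request rate stays the same, which matches what shows up as a spike in connections.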

| username: 路在何chu | Original post link

Oh, okay, I’ve learned something new. Thank you.

| username: Fly-bird | Original post link

Personally, I don't think this needs any action. As long as the nodes are healthy and leaders are not switching frequently, there should be no problem.

| username: zhanggame1 | Original post link

Check the TiKV and PD logs from that time point; they should indicate where the problem is.

| username: 路在何chu | Original post link

The TiKV log has not been updated and still ends with yesterday's entries. This is probably because the version is too old.

| username: 路在何chu | Original post link

The only restart information I can find is in the system log.

| username: 路在何chu | Original post link

But I can't see anything meaningful in it, so I still don't know why it restarted.
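When a process dies without writing to its own log, one thing worth ruling out is the Linux OOM killer, which records its kills in the kernel log. Nothing in this thread confirms that was the cause here. The Python sketch below assumes the system log lines are already available as text and that the process is named `tikv-server`; the function name is hypothetical.

```python
import re
from typing import Iterable, List

# Matches kernel messages such as "Out of memory: Kill process 123 (name) ..."
# and "Killed process 123 (name) ...".
OOM_PATTERN = re.compile(r"Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(lines: Iterable[str], process_name: str = "tikv-server") -> List[str]:
    """Return the log lines in which the kernel reports killing `process_name`."""
    hits = []
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match and match.group(2) == process_name:
            hits.append(line.rstrip("\n"))
    return hits
```

If this returns nothing for the time of the restart, the OOM killer is unlikely to be the reason, and the next step would be to check hardware or service manager events from the same time.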
