The number of leaders on the TiKV node is 0

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv节点leader数量为0

| username: porpoiselxj

【TiDB Environment】Test
【TiDB Version】v6.1.1
【Reproduction Path】No special operations performed
【Encountered Problem: Phenomenon and Impact】
One TiKV node has a leader count of 0, although the instance status is normal and its region count is also normal. The TiKV log shows “call CheckLeader failed”. The detailed log is as follows:

| username: hey-hoho | Original post link

Start by troubleshooting according to this manual:

| username: porpoiselxj | Original post link

The default policy looks normal: no evict-leader-scheduler was found, and no special labels are set. However, the leader_score of this node is 0:
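The scheduler check described above can be sketched as a small filter over the output of `pd-ctl scheduler show`. This is only a minimal sketch: the function name `find_evict_schedulers` is hypothetical, and it assumes the command prints a JSON array of scheduler names.

```shell
# Read the output of `pd-ctl scheduler show` on stdin and print every
# scheduler whose name starts with "evict-" (one name per line).
# Prints nothing if no such scheduler is registered.
find_evict_schedulers() {
  # Split on commas so a compact one-line array is handled as well,
  # then keep only the quoted names that begin with "evict-".
  tr ',' '\n' | sed -n 's/.*"\(evict-[^"]*\)".*/\1/p'
}

# Usage against a live cluster (the address is a placeholder):
#   pd-ctl -u http://<PD_IP>:<PD_Port> scheduler show | find_evict_schedulers
```

An empty result is consistent with the finding above that no evict-leader-scheduler exists.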

| username: redgame | Original post link

To check the replica distribution, use pd-ctl -u http://<PD_IP>:<PD_Port> store.
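To compare stores at a glance, the JSON printed by the `store` command can be reduced to one line per store. The following is a rough sketch, not an official tool: the function name `summarize_stores` is hypothetical, and it assumes the default pretty-printed output with one field per line, where each store's `id` appears before its `leader_count` and `leader_score`.

```shell
# Read `pd-ctl store` output on stdin and print
# "store_id leader_count leader_score" for each store.
# Assumption: pretty-printed JSON, one field per line.
summarize_stores() {
  awk '
    $1 == "\"id\":"           { id = $2; gsub(/[^0-9]/, "", id) }
    $1 == "\"leader_count\":" { lc = $2; gsub(/[^0-9]/, "", lc) }
    $1 == "\"leader_score\":" { ls = $2; gsub(/[^0-9.]/, "", ls); print id, lc, ls }
  '
}

# Usage against a live cluster (the address is a placeholder):
#   pd-ctl -u http://<PD_IP>:<PD_Port> store | summarize_stores
# Show only stores that hold no leaders:
#   pd-ctl -u http://<PD_IP>:<PD_Port> store | summarize_stores | awk '$2 == 0'
```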

| username: xfworld | Original post link

leader_score: 0

The other TiKV nodes have a normal, non-zero score, right?

| username: porpoiselxj | Original post link

All are standard nodes, without any additional configuration.

| username: porpoiselxj | Original post link

The screenshot above shows the result of using this command.

| username: xfworld | Original post link

Check the distribution of regions through Grafana.

The document recommended in the second post can be used for troubleshooting.

| username: porpoiselxj | Original post link

The distribution of regions is very balanced. Apart from this node, the leader distribution across the other nodes is also very balanced. I went through the document recommended in the second post, but couldn’t find the issue.

| username: yiduoyunQ | Original post link

Search the PD leader’s log for the keyword “detected slow store, start to evict leaders”.

| username: xfworld | Original post link

Search for “evict leader” in the TiKV logs. If you find log entries containing “evict leader”, it indicates that the leaders on this TiKV node have been evicted.

grep "evict leader" /path/to/tikv.log

Please check and confirm…
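Because the relevant entries may sit in rotated log files, it can help to search a whole log directory for both keywords mentioned in this thread at once. A minimal sketch, assuming GNU grep; the function name `search_evict_logs` and the directory layout are hypothetical:

```shell
# Recursively search a log directory (rotated files included) for the
# eviction keywords from this thread, ignoring case.
# Matches are printed as "file:line_number:line"; if nothing matches,
# a note is written to stderr instead.
search_evict_logs() {
  grep -rniE 'evict leader|detected slow store' "$1" ||
    echo "no eviction keywords found in $1" >&2
}

# Usage (paths are placeholders):
#   search_evict_logs /path/to/tikv/log
#   search_evict_logs /path/to/pd/log
```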

| username: porpoiselxj | Original post link

The above keywords could not be found in the PD leader’s logs.

| username: h5n1 | Original post link

Is the slow score of this node 1, the normal value? If not, it means that a performance issue was detected on this TiKV and its leaders were evicted as a result.
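One way to read the slow score directly is from the Prometheus metrics that TiKV exposes on its status address. The sketch below rests on assumptions: the metric name `tikv_raftstore_slow_score` is not confirmed anywhere in this thread, so it is only a default that can be overridden, and the function name `slow_score_from_metrics` is hypothetical.

```shell
# Read Prometheus text-format metrics on stdin and print the value of the
# slow-score gauge. The default metric name is an assumption; pass a
# different name as the first argument if your version uses another one.
slow_score_from_metrics() {
  metric=${1:-tikv_raftstore_slow_score}
  # Match the bare metric name or the name followed by a label set,
  # and print the last field, which is the sample value.
  awk -v m="$metric" '$1 == m || index($1, m "{") == 1 { print $NF }'
}

# Usage (the status address is a placeholder):
#   curl -s http://<TiKV_IP>:<TiKV_Status_Port>/metrics | slow_score_from_metrics
```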

| username: porpoiselxj | Original post link

In the logs of the TiKV node with 0 leaders, the above keywords could not be found. It seems that the leaders on this node were evicted within a very short time.

| username: porpoiselxj | Original post link

All nodes have this value set to 1.

| username: 裤衩儿飞上天 | Original post link

Search for “evict leader” in the TiKV logs.

| username: porpoiselxj | Original post link

I searched again, and indeed there is nothing. The leader was evicted around 7:40 AM on the 4th. I checked all the logs from the 4th, and there are no such keywords.
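When the keywords turn up nothing, another option is to pull every log line from the minutes around the eviction and read it in full. A minimal sketch, assuming each log line begins with a bracketed timestamp such as `[YYYY/MM/DD HH:MM:SS.mmm +08:00]`; the function name `logs_in_window` is hypothetical, and the date in the usage line is a placeholder because the thread only gives "the 4th".

```shell
# Print, prefixed with the file name, every line whose leading timestamp
# starts with the given prefix. A prefix ending in "07:4" covers
# 07:40 through 07:49.
logs_in_window() {
  prefix=$1
  shift
  # index(...) == 1 anchors the match to the start of the line.
  awk -v p="[$prefix" 'index($0, p) == 1 { print FILENAME ": " $0 }' "$@"
}

# Usage (date and paths are placeholders):
#   logs_in_window 'YYYY/MM/04 07:4' /path/to/tikv.log /path/to/pd.log
```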

| username: 裤衩儿飞上天 | Original post link

Upload the logs of this node, including those from the 4th.

| username: h5n1 | Original post link

Check the PD leader’s logs for this time as well.

| username: porpoiselxj | Original post link

This is a bit tricky, let me see if I can figure something out.