PD Error: No Valid Leader

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: PD报错 no valid leader

| username: 巨化斑鸠

【TiDB Usage Environment】Testing
【TiDB Version】v6.5.8
【Reproduction Path】Unclear
【Encountered Problem: Phenomenon and Impact】PD keeps flooding its log with the warning below, filling the disk within a day, after which the PD process exits.
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】pd.log
[2024/05/24 14:38:01.277 +08:00] [WARN] [merge_checker.go:175] ["create merge region operator failed"] [error="no valid leader"]
[2024/05/24 14:38:01.288 +08:00] [WARN] [merge_checker.go:175] ["create merge region operator failed"] [error="no valid leader"]
[2024/05/24 14:38:01.289 +08:00] [WARN] [merge_checker.go:175] ["create merge region operator failed"] [error="no valid leader"]
[2024/05/24 14:38:01.299 +08:00] [WARN] [merge_checker.go:175] ["create merge region operator failed"] [error="no valid leader"]

How can I solve the “no valid leader” issue?

With the log level set to debug, there are also a large number of “no replacement store” errors.
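To see which messages are actually filling the disk, it can help to count log lines by source and error. The following is a minimal sketch, assuming pd.log follows the bracketed format shown in the excerpt above; the function name summarize and the quote normalization (for text pasted through a forum) are illustrative, not part of any TiDB tool.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt in this post:
# [time] [LEVEL] [file:line] ["message"] [key="value"] ...
LOG_LINE = re.compile(
    r'^\[(?P<time>[^\]]+)\] \[(?P<level>[A-Z]+)\] '
    r'\[(?P<source>[^\]]+)\] \[(?P<rest>.*)\]\s*$'
)


def normalize_quotes(text):
    """Turn typographic quotes (from copy/paste) back into plain ones."""
    return text.replace("\u201c", '"').replace("\u201d", '"')


def summarize(lines):
    """Count lines by (level, source, message, error).

    Lines that do not match the assumed layout are skipped. Messages that
    themselves contain "] [" are not handled; this is only a sketch.
    """
    counts = Counter()
    for raw in lines:
        match = LOG_LINE.match(normalize_quotes(raw.strip()))
        if match is None:
            continue
        fields = match.group("rest").split("] [")
        message = fields[0].strip('"')
        error = ""
        for field in fields[1:]:
            if field.startswith("error="):
                error = field.split("=", 1)[1].strip('"')
        key = (match.group("level"), match.group("source"), message, error)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # "pd.log" is a placeholder path; point it at the real PD log file.
    with open("pd.log", encoding="utf-8", errors="replace") as handle:
        for key, count in summarize(handle).most_common(10):
            print(count, key)
```

Comparing the counts for “no valid leader” and “no replacement store” shows whether one checker is responsible for most of the volume.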

| username: 这里介绍不了我 | Original post link

Are PD and TiDB deployed on the same machine?

| username: Kongdom | Original post link

Is TiKV working properly? The error says there is no valid leader.

| username: 巨化斑鸠 | Original post link

Yes, PD and the TiDB server are on the same machine.

| username: 这里介绍不了我 | Original post link

Was the machine’s load normal when the error occurred?

| username: lemonade010 | Original post link

First, check whether TiKV is functioning properly. It looks like PD cannot find a leader when it tries to create the Region merge operator.
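One way to check TiKV from PD's point of view is to look at the store states PD reports. The sketch below assumes the PD HTTP endpoint /pd/api/v1/stores and its usual JSON shape (a "stores" list whose items carry "store" and "status" objects); the address and the helper names are placeholders, so verify them against your own cluster before relying on the output.

```python
import json
from urllib.request import urlopen


def fetch_stores(pd_addr):
    """Fetch store information from PD, e.g. pd_addr='http://127.0.0.1:2379'.

    The address is a placeholder; use the client URL of a reachable PD node.
    """
    with urlopen(f"{pd_addr}/pd/api/v1/stores", timeout=5) as resp:
        return json.load(resp)


def unhealthy_stores(stores_doc):
    """Return (id, address, state, leader_count) for stores not in state 'Up'.

    Field names follow the assumed response shape and may differ by version.
    """
    bad = []
    for item in stores_doc.get("stores", []):
        store = item.get("store", {})
        status = item.get("status", {})
        state = store.get("state_name", "Unknown")
        if state != "Up":
            bad.append(
                (
                    store.get("id"),
                    store.get("address"),
                    state,
                    status.get("leader_count", 0),
                )
            )
    return bad


if __name__ == "__main__":
    for row in unhealthy_stores(fetch_stores("http://127.0.0.1:2379")):
        print(row)
```

An empty result means PD sees every store as Up, which would point away from a plain TiKV outage.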

| username: 巨化斑鸠 | Original post link

Test environment, basically idle load.

| username: 巨化斑鸠 | Original post link

TiKV did not report any errors.

| username: yytest | Original post link

The logs show that PD failed to create a merge region operator, with the error message “no valid leader.” This usually means the Region has no valid Raft leader at that moment. Make sure all nodes are configured correctly, especially the Raft-related TiKV settings such as raft-base-tick-interval, raft-election-timeout-ticks, raft-max-size-per-msg, and raft-max-inflight-msgs.
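Before changing Raft settings, it may be worth ruling out whether any Region really lacks a leader. The sketch below assumes the PD HTTP endpoint /pd/api/v1/regions returns a "regions" list in which each entry has an "id" and, when a leader is known, a "leader" object with "id" and "store_id"; that shape and the helper names are assumptions to confirm on your version.

```python
import json
from urllib.request import urlopen


def fetch_regions(pd_addr):
    """Fetch all regions from PD; can be large on a big cluster.

    pd_addr is a placeholder such as 'http://127.0.0.1:2379'.
    """
    with urlopen(f"{pd_addr}/pd/api/v1/regions", timeout=30) as resp:
        return json.load(resp)


def regions_without_leader(regions_doc):
    """Return IDs of regions whose leader is missing or has a zero ID."""
    missing = []
    for region in regions_doc.get("regions", []):
        leader = region.get("leader") or {}
        # Treat an absent leader object, or one without a real peer/store ID,
        # as "no leader known to PD".
        if not leader.get("id") or not leader.get("store_id"):
            missing.append(region.get("id"))
    return missing


if __name__ == "__main__":
    ids = regions_without_leader(fetch_regions("http://127.0.0.1:2379"))
    print(f"{len(ids)} region(s) without a known leader: {ids[:20]}")
```

If this list is empty while the warning keeps appearing, the cause is more likely on the scheduling side than in Raft elections.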

| username: xfworld | Original post link

Find a healthy PD node and transfer the PD leader to it.

Or free up disk space and try restarting the cluster.

| username: 鱼跃龙门 | Original post link

Is there a problem with the network? Try clearing the disk. Are there any errors reported elsewhere?

| username: h5n1 | Original post link

Is this the only error? Do the logs contain any Region information? Have you enabled placement rules? Use SHOW PLACEMENT to check.

| username: 小于同学 | Original post link

Has PD recovered?

| username: TiDBer_H5NdJb5Q | Original post link

Has no leader been elected this whole time? Is the network reachable? Try freeing up disk space and restarting.