TiDB fails to start

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb启动不起来

| username: TiDBer_rYOSh9JN

【TiDB Usage Environment】Testing
【TiDB Version】
V6.5.0
【Reproduction Path】What operations were performed when the issue occurred
Single-machine deployment with 3 TiKV nodes; the cluster could not start after the machine's IP address was changed
【Encountered Issue: Problem Phenomenon and Impact】
Unable to start TiDB
【Resource Configuration】Enter TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
Unable to enter the dashboard, login prompt:
Sign in failed: No alive TiDB instance
【Attachments: Screenshots/Logs/Monitoring】
Timeout when using tiup to start, final error:
Starting component pd
Starting instance 10.11.31.237:2379
Start instance 10.11.31.237:2379 success
Starting component tikv
Starting instance 10.11.31.237:20162
Starting instance 10.11.31.237:20160
Starting instance 10.11.31.237:20161
Start instance 10.11.31.237:20161 success
Start instance 10.11.31.237:20162 success
Start instance 10.11.31.237:20160 success
Starting component tidb
Starting instance 10.11.31.237:4000

Error: failed to start: failed to start tidb: failed to start: 10.11.31.237 tidb-4000.service, please check the instance's log(/opt/apps/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s

tidb.log repeatedly reports this error

[2024/03/14 19:01:57.412 +08:00] [WARN] [backoff.go:158] ["pdRPC backoffer.maxSleep 40000ms is exceeded, errors:\nregion not found for key "7480000000000000125F6980000000000000010173797374656D5F74FF7A00000000000000F8" at 2024-03-14T19:01:50.298878054+08:00\nregion not found for key "7480000000000000125F6980000000000000010173797374656D5F74FF7A00000000000000F8" at 2024-03-14T19:01:52.646599719+08:00\nregion not found for key "7480000000000000125F6980000000000000010173797374656D5F74FF7A00000000000000F8" at 2024-03-14T19:01:55.010713466+08:00\nlongest sleep type: pdRPC, time: 41591ms"]
[2024/03/14 19:01:57.412 +08:00] [FATAL] [terror.go:300] ["unexpected error"] [error="[tikv:9001]PD server timeout"] [stack="github.com/pingcap/tidb/parser/terror.MustNil\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/terror/terror.go:300\nmain.createStoreAndDomain\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:315\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:214\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"] [stack="github.com/pingcap/tidb/parser/terror.MustNil\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/parser/terror/terror.go:300\nmain.createStoreAndDomain\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:315\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:214\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]
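
For readers trying to work out what TiDB was looking for when it logged "region not found for key", the hex key can be decoded by hand. The following is a minimal sketch, assuming the key follows TiDB's usual index-key layout (`t` + table ID + `_i` + index ID + encoded column values, with integers stored as 8 big-endian bytes with the sign bit flipped and byte strings stored in padded 8-byte groups). The function names are made up for illustration and are not part of any TiDB tool.

```python
def decode_int(buf, pos):
    """Decode an 8-byte big-endian integer whose sign bit was flipped for ordering."""
    if len(buf) < pos + 8:
        raise ValueError("truncated integer")
    value = int.from_bytes(buf[pos:pos + 8], "big") ^ (1 << 63)
    if value >= 1 << 63:  # map back into the signed 64-bit range
        value -= 1 << 64
    return value, pos + 8


def decode_memcomparable_bytes(buf, pos):
    """Decode one byte string stored as 8-byte groups, each followed by a marker byte.

    The marker is 0xFF minus the number of padding zeros in the group;
    any group with padding is the last one.
    """
    out = bytearray()
    while True:
        chunk = buf[pos:pos + 9]
        if len(chunk) < 9:
            raise ValueError("truncated byte group")
        pad = 0xFF - chunk[8]
        if pad > 8:
            raise ValueError("invalid group marker")
        out += chunk[:8 - pad]
        pos += 9
        if pad > 0:
            return bytes(out), pos


def decode_index_key(hex_key):
    """Return (table_id, index_id, first_column_value) for a hex-encoded index key."""
    raw = bytes.fromhex(hex_key)
    if raw[:1] != b"t":
        raise ValueError("not a table key")
    table_id, pos = decode_int(raw, 1)
    if raw[pos:pos + 2] != b"_i":
        raise ValueError("not an index key")
    index_id, pos = decode_int(raw, pos + 2)
    if raw[pos:pos + 1] != b"\x01":  # 0x01 is assumed to be the byte-string flag
        raise ValueError("first index column is not a byte string")
    value, pos = decode_memcomparable_bytes(raw, pos + 1)
    return table_id, index_id, value


# The key copied from the WARN line above.
KEY = "7480000000000000125F6980000000000000010173797374656D5F74FF7A00000000000000F8"
print(decode_index_key(KEY))
```

Under those assumptions, the key from the log decodes to table ID 18, index ID 1, and the value `system_tz`. That would suggest TiDB is failing on one of its first system-table reads during startup because PD cannot map the key to a region, which fits the "PD server timeout" in the FATAL line.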

| username: TiDBer_rYOSh9JN | Original post link

Is there an expert here? I need help…

| username: changpeng75 | Original post link

After changing the IP, did you update the configuration of every component, from TiDB Server to PD?

| username: Jasper | Original post link

The meta file hasn't been changed, right? Refer to this link for the operation: Column - Data Center Migration and Changing Cluster IP | TiDB Community

| username: 随缘天空 | Original post link

What changes were made to the configuration file after the IP was changed? Let’s check the configuration.

| username: zhaokede | Original post link

The startup timed out while waiting, which most likely means the configuration changes are incomplete. Please check carefully.

| username: Jellybean | Original post link

Is the PD cluster functioning normally? There is a timeout when accessing PD here. What is your procedure for changing the IP?

| username: Jellybean | Original post link

It is recommended to first roll back the operation and use the original IP to restore the cluster.

Then, to replace the whole cluster's IP online, scale out nodes on the new IP first and scale in the nodes on the old IP afterwards.

| username: YuchongXU | Original post link

I suggest scaling out and then scaling in.

| username: zhanggame1 | Original post link

Don’t use a real IP for single-machine deployment. Refer to my setup using the 127.0.0.1 address; once installed, other machines can also connect. Otherwise, changing the IP is very troublesome.

| username: redgame | Original post link

What changes were made?

| username: ffeenn | Original post link

Check the PD status and review the error logs.
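
One way to do the PD check is to ask PD which store addresses it has registered and compare them with the machine's current IP. The following is a minimal sketch, assuming PD's HTTP API is reachable and that `/pd/api/v1/stores` returns a `stores` list whose items carry `store.id`, `store.address`, and `store.state_name`. The function names and the sample data are hypothetical and do not come from this cluster.

```python
import json
from urllib.request import urlopen


def fetch_stores(pd_addr, timeout=5):
    """Fetch the store list from a PD endpoint such as "10.11.31.237:2379"."""
    with urlopen(f"http://{pd_addr}/pd/api/v1/stores", timeout=timeout) as resp:
        return json.load(resp)


def stores_not_on_host(stores_doc, expected_host):
    """Return (id, address, state) for every store registered under another host."""
    mismatched = []
    for item in stores_doc.get("stores", []):
        store = item.get("store", {})
        address = store.get("address", "")
        host = address.rsplit(":", 1)[0]  # strip the trailing ":port"
        if host != expected_host:
            mismatched.append((store.get("id"), address, store.get("state_name")))
    return mismatched


# Hypothetical response: one store still registered under an old address.
SAMPLE_STORES = {
    "count": 2,
    "stores": [
        {"store": {"id": 1, "address": "10.11.31.237:20160", "state_name": "Up"}},
        {"store": {"id": 4, "address": "192.0.2.10:20161", "state_name": "Disconnected"}},
    ],
}

print(stores_not_on_host(SAMPLE_STORES, "10.11.31.237"))
```

Against a live cluster, `stores_not_on_host(fetch_stores("10.11.31.237:2379"), "10.11.31.237")` would run the same check with real data. Any store listed under another host would mean PD's metadata still refers to addresses other than the one the machine currently uses.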

| username: DBAER | Original post link

It looks like the request to PD timed out.

| username: tidb菜鸟一只 | Original post link

Changing the IP is not that simple. Follow the procedure in Column - Data Center Migration and Changing Cluster IP | TiDB Community.

| username: gary | Original post link

The meta.yaml information might not have been modified, please check.
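
A quick way to check this is to scan the file for leftover references to the old address. The following is a minimal sketch, assuming TiUP keeps the cluster metadata at its default location, `~/.tiup/storage/cluster/clusters/<cluster-name>/meta.yaml`. The old IP is not given in this thread, so `192.0.2.10` and the sample text are placeholders, and `lines_with_ip` is a made-up helper name.

```python
import re


def lines_with_ip(text, ip):
    """Return (line_number, stripped_line) for each line that mentions exactly this IP.

    The lookarounds stop "192.0.2.10" from matching inside "192.0.2.100"
    or "1192.0.2.10".
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\d)")
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), 1)
        if pattern.search(line)
    ]


# Placeholder excerpt in the shape of a TiUP topology, not the real file.
SAMPLE_META = """\
topology:
  pd_servers:
  - host: 192.0.2.10
    client_port: 2379
  tikv_servers:
  - host: 192.0.2.100
    port: 20160
"""

for number, line in lines_with_ip(SAMPLE_META, "192.0.2.10"):
    print(f"line {number}: {line}")
```

To run it against the real file, read `meta.yaml` into a string and pass the old IP. Every line it reports is a place where the metadata still points at the old address.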

| username: Soysauce520 | Original post link

You can refer to this article: Column - Modifying the IP and Port of a Live TiDB Cluster | TiDB Community.

| username: tony5413 | Original post link

Refer to TiUP Modify Cluster IP (Based on Version V6)

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.