PD process won't start. Can you help find the reason?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd进程起不来,帮忙看下什么原因?

| username: 月明星稀

[TiDB Usage Environment] Production Environment
[TiDB Version] 6.5.0
[Reproduction Path] Running clean with --all or --data and then restarting has no effect
[Encountered Problem: Problem Phenomenon and Impact]
None of the pd-server processes in the cluster will start: each process starts and then exits. There is no issue with the cluster network.
Error logs are as follows:
[2024/06/03 19:20:55.408 +08:00] [ERROR] [etcdutil.go:126] [“load from etcd meet error”] [key=/pd/cluster_id] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”]
[2024/06/03 19:20:55.408 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:21:11.019 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:21:11.019 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:21:49.419 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:22:19.211 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:22:29.211 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:22:47.524 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:23:21.706 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:23:36.970 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:23:49.810 +08:00] [ERROR] [etcdutil.go:126] [“load from etcd meet error”] [key=/pd/cluster_id] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”]
[2024/06/03 19:23:49.810 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:24:05.254 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:24:05.254 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:24:24.007 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:24:39.510 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:24:39.510 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:24:39.510 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:25:03.310 +08:00] [ERROR] [etcdutil.go:126] [“load from etcd meet error”] [key=/pd/cluster_id] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”]
[2024/06/03 19:25:03.310 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:25:18.715 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:25:35.606 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:25:50.968 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:25:50.968 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:25:50.968 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:26:14.207 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:26:29.990 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:26:29.990 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:26:29.999 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:26:45.910 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:27:01.528 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:27:18.508 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:27:33.826 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:27:33.835 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:27:52.006 +08:00] [ERROR] [etcdutil.go:126] [“load from etcd meet error”] [key=/pd/cluster_id] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”]
[2024/06/03 19:27:52.007 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdKVGet]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:28:07.513 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:28:07.526 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:28:20.306 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:28:35.732 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:28:35.733 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:28:35.733 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:29:00.511 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:29:18.490 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:29:18.500 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:29:45.807 +08:00] [FATAL] [main.go:117] [“run server failed”] [error=“[PD:etcd:ErrEtcdMemberList]context deadline exceeded: context deadline exceeded”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:117\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2024/06/03 19:30:01.278 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[2024/06/03 19:30:01.288 +08:00] [ERROR] [etcdutil.go:71] [“failed to get cluster from remote”] [error=“[PD:etcd:ErrEtcdGetCluster]could not retrieve cluster information from the given URLs: could not retrieve cluster information from the given URLs”]
[202

| username: Miracle | Original post link

Adjust the log level to info, and then send the PD logs.

| username: Billmay表妹 | Original post link

What does the deployment look like: how many PD, TiDB, and TiKV nodes are there?

| username: 月明星稀 | Original post link

Thanks for helping to check:
[2024/06/03 19:29:15.930 +08:00] [INFO] [util.go:41] [“Welcome to Placement Driver (PD)”]
[2024/06/03 19:29:15.930 +08:00] [INFO] [util.go:42] [PD] [release-version=v6.5.0]
[2024/06/03 19:29:15.930 +08:00] [INFO] [util.go:43] [PD] [edition=Community]
[2024/06/03 19:29:15.930 +08:00] [INFO] [util.go:44] [PD] [git-hash=d1a4433c3126c77fb2d5bb5720eefa0f2e05c166]
[2024/06/03 19:29:15.930 +08:00] [INFO] [util.go:45] [PD] [git-branch=heads/refs/tags/v6.5.0]
[2024/06/03 19:29:15.930 +08:00] [INFO] [util.go:46] [PD] [utc-build-time=“2022-12-16 08:19:07”]
[2024/06/03 19:29:15.930 +08:00] [INFO] [metricutil.go:83] [“disable Prometheus push client”]
[2024/06/03 19:29:15.930 +08:00] [INFO] [server.go:247] [“PD Config”] [config=“{"client-urls":"https://0.0.0.0:2379","peer-urls":"https://0.0.0.0:2380","advertise-client-urls":"https://152.122.65.30:2379","advertise-peer-urls":"https://152.122.65.30:2380","name":"pd-152.122.65.30-2379","data-dir":"/cache8/pd-data","force-new-cluster":false,"enable-grpc-gateway":true,"initial-cluster":"pd-152.122.65.30-2379=https://152.122.65.30:2380,pd-152.122.65.32-2379=https://152.122.65.32:2380,pd-152.122.65.35-2379=https://152.122.65.35:2380,pd-58.220.65.85-2379=https://58.220.65.85:2380","initial-cluster-state":"new","initial-cluster-token":"pd-cluster","join":"","lease":3,"log":{"level":"info","format":"text","disable-timestamp":false,"file":{"filename":"/usr/local/tikvserver/pd-2379/log/pd.log","max-size":300,"max-days":0,"max-backups":0},"development":false,"disable-caller":false,"disable-stacktrace":false,"disable-error-verbose":true,"sampling":null,"error-output-path":""},"tso-save-interval":"3s","tso-update-physical-interval":"50ms","enable-local-tso":false,"metric":{"job":"pd-152.122.65.30-2379","address":"","interval":"15s"},"schedule":{"max-snapshot-count":64,"max-pending-peer-count":64,"max-merge-region-size":20,"max-merge-region-keys":200000,"split-merge-interval":"1h0m0s","swtich-witness-interval":"1h0m0s","enable-one-way-merge":"false","enable-cross-table-merge":"true","patrol-region-interval":"10ms","max-store-down-time":"30m0s","max-store-preparing-time":"48h0m0s","leader-schedule-limit":20,"leader-schedule-policy":"count","region-schedule-limit":2048,"replica-schedule-limit":64,"merge-schedule-limit":8,"hot-region-schedule-limit":20,"hot-region-cache-hits-threshold":1,"store-limit":{},"tolerant-size-ratio":0,"low-space-ratio":0.8,"high-space-ratio":0.7,"region-score-formula-version":"v2","scheduler-max-waiting-operator":5,"enable-remove-down-replica":"true","enable-replace-offline-replica":"true","enable-make-up-replica":"true","enable-remove-extra-replica":"tru
e","enable-location-replacement":"true","enable-debug-metrics":"false","enable-joint-consensus":"true","enable-tikv-split-region":"true","schedulers-v2":[{"type":"balance-region","args":null,"disable":false,"args-payload":""},{"type":"balance-leader","args":null,"disable":false,"args-payload":""},{"type":"hot-region","args":null,"disable":false,"args-payload":""},{"type":"split-bucket","args":null,"disable":false,"args-payload":""}],"schedulers-payload":null,"store-limit-mode":"manual","hot-regions-write-interval":"10m0s","hot-regions-reserved-days":7,"enable-diagnostic":"false","enable-witness":"false"},"replication":{"max-replicas":3,"location-labels":"","strictly-match-label":"false","enable-placement-rules":"true","enable-placement-rules-cache":"false","isolation-level":""},"pd-server":{"use-region-storage":"true","max-gap-reset-ts":"24h0m0s","key-type":"table","runtime-services":"","metric-storage":"","dashboard-address":"auto","trace-region-flow":"true","flow-round-by-digit":3,"min-resolved-ts-persistence-interval":"1s"},"cluster-version":"0.0.0","labels":{},"quota-backend-bytes":"8GiB","auto-compaction-mode":"periodic","auto-compaction-retention-v2":"1h","TickInterval":"500ms","ElectionInterval":"3s","PreVote":true,"max-request-bytes":157286400,"security":{"cacert-path":"/usr/local/tikvserver/pd-2379/tls/ca.crt","cert-path":"/usr/local/tikvserver/pd-2379/tls/pd.crt","key-path":"/usr/local/tikvserver/pd-2379/tls/pd.pem","cert-allowed-cn":null,"SSLCABytes":null,"SSLCertBytes":null,"SSLKEYBytes":null,"redact-info-log":false,"encryption":{"data-encryption-method":"plaintext","data-key-rotation-period":"168h0m0s","master-key":{"type":"plaintext","key-id":"","region":"","endpoint":"","path":""}}},"label-property":null,"WarningMsgs":null,"DisableStrictReconfigCheck":false,"HeartbeatStreamBindInterval":"1m0s","LeaderPriorityCheckInterval":"1m0s","dashboard":{"tidb-cacert-path":"","tidb-cert-path":"","tidb-key-path":"","public-path-prefix":"","internal-proxy":false,"e
nable-telemetry":true,"enable-experimental":false},"replication-mode":{"replication-mode":"majority","dr-auto-sync":{"label-key":"","primary":"","dr":"","primary-replicas":0,"dr-replicas":0,"wait-store-timeout":"1m0s","pause-region-split":"false"}}}”]
[2024/06/03 19:29:15.938 +08:00] [INFO] [server.go:222] [“register REST path”] [path=/pd/api/v1]
[2024/06/03 19:29:15.939 +08:00] [INFO] [server.go:222] [“register REST path”] [path=/pd/api/v2/]
[2024/06/03 19:29:15.939 +08:00] [INFO] [server.go:222] [“register REST path”] [path=/swagger/]
[2024/06/03 19:29:15.939 +08:00] [INFO] [server.go:222] [“register REST path”] [path=/autoscaling]
[2024/06/03 19:29:15.939 +08:00] [INFO] [distro.go:51] [“Using distribution strings”] [strings={}]
[2024/06/03 19:29:15.942 +08:00] [INFO] [server.go:222] [“register REST path”] [path=/dashboard/api/]
[2024/06/03 19:29:15.942 +08:00] [INFO] [server.go:222] [“register REST path”] [path=/dashboard/]
[2024/06/03 19:29:15.942 +08:00] [INFO] [etcd.go:117] [“configuring peer listeners”] [listen-peer-urls=“[https://0.0.0.0:2380]”]
[2024/06/03 19:29:15.942 +08:00] [INFO] [etcd.go:474] [“starting with peer TLS”] [tls-info=“cert = /usr/local/tikvserver/pd-2379/tls/pd.crt, key = /usr/local/tikvserver/pd-2379/tls/pd.pem, trusted-ca = /usr/local/tikvserver/pd-2379/tls/ca.crt, client-cert-auth = true, crl-file = “] [cipher-suites=””]
[2024/06/03 19:29:15.942 +08:00] [INFO] [systimemon.go:28] [“start system time monitor”]
[2024/06/03 19:29:15.942 +08:00] [INFO] [etcd.go:127] [“configuring client listeners”] [listen-client-urls=“[https://0.0.0.0:2379]”]
[2024/06/03 19:29:15.942 +08:00] [INFO] [etcd.go:611] [“pprof is enabled”] [path=/debug/pprof]
[2024/06/03 19:29:15.943 +08:00] [INFO] [etcd.go:305] [“starting an etcd server”] [etcd-version=3.4.21] [git-sha=“Not provided (use ./build instead of go build)”] [go-version=go1.19.3] [go-os=linux] [go-arch=amd64] [max-cpu-set=80] [max-cpu-available=80] [member-initialized=true] [name=pd-152.122.65.30-2379] [data-dir=/cache8/pd-data] [wal-dir=] [wal-dir-dedicated=] [member-dir=/cache8/pd-data/member] [force-new-cluster=false] [heartbeat-interval=500ms] [election-timeout=3s] [initial-election-tick-advance=true] [snapshot-count=100000] [snapshot-catchup-entries=5000] [initial-advertise-peer-urls=“[https://152.122.65.30:2380]”] [listen-peer-urls=“[https://0.0.0.0:2380]”] [advertise-client-urls=“[https://152.122.65.30:2379]”] [listen-client-urls=“[https://0.0.0.0:2379]”] [listen-metrics-urls=“”] [cors=“[]“] [host-whitelist=”[]”] [initial-cluster=] [initial-cluster-state=new] [initial-cluster-token=] [quota-backend-bytes=8589934592] [max-request-bytes=157286400] [max-concurrent-streams=4294967295] [pre-vote=true] [initial-corrupt-check=false] [corrupt-check-time-interval=0s] [auto-compaction-mode=periodic] [auto-compaction-retention=1h0m0s] [auto-compaction-interval=1h0m0s] [discovery-url=] [discovery-proxy=]
[2024/06/03 19:29:15.943 +08:00] [INFO] [backend.go:80] [“opened backend db”] [path=/cache8/pd-data/member/snap/db] [took=465.904µs]
[2024/06/03 19:29:15.952 +08:00] [INFO] [raft.go:586] [“restarting local member”] [cluster-id=821152945a57894b] [local-member-id=6443f7aacb544c75] [commit-index=107]
[2024/06/03 19:29:15.953 +08:00] [INFO] [raft.go:1523] [“6443f7aacb544c75 switched to configuration voters=()”]
[2024/06/03 19:29:15.953 +08:00] [INFO] [raft.go:706] [“6443f7aacb544c75 became follower at term 16”]
[2024/06/03 19:29:15.953 +08:00] [INFO] [raft.go:389] [“newRaft 6443f7aacb544c75 [peers: , term: 16, commit: 107, applied: 0, lastindex: 108, lastterm: 16]”]
[2024/06/03 19:29:15.954 +08:00] [INFO] [quota.go:126] [“enabled backend quota”] [quota-name=v3-applier] [quota-size-bytes=8589934592] [quota-size=“8.6 GB”]
[2024/06/03 19:29:15.955 +08:00] [INFO] [server.go:816] [“starting etcd server”] [local-member-id=6443f7aacb544c75] [local-server-version=3.4.21] [cluster-version=to_be_decided]
[2024/06/03 19:29:15.955 +08:00] [INFO] [server.go:704] [“starting initial election tick advance”] [election-ticks=6]
[2024/06/03 19:29:15.956 +08:00] [INFO] [raft.go:1523] [“6443f7aacb544c75 switched to configuration voters=(1282144946999282109)”]
[2024/06/03 19:29:15.956 +08:00] [INFO] [cluster.go:392] [“added member”] [cluster-id=821152945a57894b] [local-member-id=6443f7aacb544c75] [added-peer-id=11cb1809447449bd] [added-peer-peer-urls=“[https://152.122.65.35:2380]”]
[2024/06/03 19:29:15.956 +08:00] [INFO] [peer.go:128] [“starting remote peer”] [remote-peer-id=11cb1809447449bd]
[2024/06/03 19:29:15.956 +08:00] [INFO] [pipeline.go:71] [“started HTTP pipelining with remote peer”] [local-member-id=6443f7aacb544c75] [remote-peer-id=11cb1809447449bd]
[2024/06/03 19:29:15.956 +08:00] [INFO] [stream.go:166] [“started stream writer with remote peer”] [local-member-id=6443f7aacb544c75] [remote-peer-id=11cb1809447449bd]
[2024/06/03 19:29:15.956 +08:00] [INFO] [stream.go:166] [“started stream writer with remote peer”] [local-member-id=6443f7aacb544c75] [remote-peer-id=11cb1809447449bd]
[2024/06/03 19:29:15.956 +08:00] [INFO] [peer.go:134] [“started remote peer”] [remote-peer-id=11cb1809447449bd]
[2024/06/03 19:29:15.957 +08:00] [INFO] [transport.go:327] [“added remote peer”] [local-member-id=6443f7aacb544c75] [remote-peer-id=11cb1809447449bd] [remote-peer-urls=“[https://152.122.65.35:2380]”]
[2024/06/03 19:29:15.957 +08:00] [INFO] [stream.go:406] [“started stream reader with remote peer”] [stream-reader-type=“stream MsgApp v2”] [local-member-id=6443f7aacb544c75] [remote-peer-id=11cb1809447449bd]
[2024/06/03 19:29:15.957 +08:00] [INFO] [raft.go:1523] [“6443f7aacb544c75 switched to configuration voters=(1282144946999282109 7224890540160207989)”]
[2024/06/03 19:29:15.957 +08:00] [INFO] [stream.go:406] [“started stream reader with remote peer”] [stream-reader-type=“stream Message”] [local-member-id=6443f7aacb544c75] [remote-peer-id=11cb1809447449bd]
[2024/06/03 19:29:15.957 +08:00] [INFO] [cluster.go:392] [“added member”] [cluster-id=821152945a57894b] [local-member-id=6443f7aacb544c75] [added-peer-id=6443f7aacb544c75] [added-peer-peer-urls=“[https://152.122.65.30:2380]”]
[2024/06/03 19:29:15.957 +08:00] [INFO] [raft.go:1523] [“6443f7aacb544c75 switched to configuration voters=(1282144946999282109 7224890540160207989 13449958727701477628)”]
[2024/06/03 19:29:15.957 +08:00] [INFO] [cluster.go:392] [“added member”] [cluster-id=821152945a57894b]

| username: 月明星稀 | Original post link

4 TiKV, 4 PD

| username: Miracle | Original post link

Is the IP 58.220.65.85 a PD pod or a PD service?

| username: tidb狂热爱好者 | Original post link

Run `tiup cluster display tidb-test` to show the cluster topology, and paste the output here.

| username: 月明星稀 | Original post link

They are all physical machines. PD and TiKV are also deployed on this IP, and they belong to the same cluster.

| username: 月明星稀 | Original post link

tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.12.0/tiup-cluster display JS-yangzhou8-DX
Cluster type:       tidb
Cluster name:       JS-yangzhou8-DX
Cluster version:    v6.5.0
Deploy user:        tikvserver
SSH type:           builtin
TLS encryption:     enabled
CA certificate:     /root/.tiup/storage/cluster/clusters/JS-yangzhou8-DX/tls/ca.crt
Client private key: /root/.tiup/storage/cluster/clusters/JS-yangzhou8-DX/tls/client.pem
Client certificate: /root/.tiup/storage/cluster/clusters/JS-yangzhou8-DX/tls/client.crt
ID                   Role  Host           Ports        OS/Arch       Status  Data Dir                      Deploy Dir
--                   ----  ----           -----        -------       ------  --------                      ----------
152.122.65.30:2379   pd    152.122.65.30  2379/2380    linux/x86_64  Down    /cache8/pd-data               /usr/local/tikvserver/pd-2379
152.122.65.32:2379   pd    152.122.65.32  2379/2380    linux/x86_64  Down    /cache8/pd-data               /usr/local/tikvserver/pd-2379
152.122.65.35:2379   pd    152.122.65.35  2379/2380    linux/x86_64  Down    /cache8/pd-data               /usr/local/tikvserver/pd-2379
58.220.65.85:2379    pd    58.220.65.85   2379/2380    linux/x86_64  Down    /cache8/pd-data               /usr/local/tikvserver/pd-2379
152.122.65.30:20160  tikv  152.122.65.30  20160/20180  linux/x86_64  N/A     /cache8/tikv-data/tikv-20160  /usr/local/tikvserver/tikv-20160
152.122.65.32:20160  tikv  152.122.65.32  20160/20180  linux/x86_64  N/A     /cache8/tikv-data/tikv-20160  /usr/local/tikvserver/tikv-20160
152.122.65.35:20160  tikv  152.122.65.35  20160/20180  linux/x86_64  N/A     /cache8/tikv-data/tikv-20160  /usr/local/tikvserver/tikv-20160
58.220.65.85:20160   tikv  58.220.65.85   20160/20180  linux/x86_64  N/A     /cache8/tikv-data/tikv-20160  /usr/local/tikvserver/tikv-20160

| username: h5n1 | Original post link

Is iptables configured?
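
A firewall rule would typically show up as the PD client port (2379) or peer port (2380) being unreachable from the other PD hosts. Below is a minimal sketch of such a reachability check. The helper names `can_connect` and `check_peers` are made up for illustration, and the check only proves that a TCP connection can be opened; since this cluster has TLS enabled, a successful connect says nothing about whether the certificates are accepted.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_peers(hosts, ports=(2379, 2380), timeout: float = 3.0) -> dict:
    """Map each (host, port) pair to True/False depending on TCP reachability."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }
```

Running `check_peers([...])` with the PD host IPs from each PD machine in turn would show whether any pair of nodes cannot reach each other. Note that while a PD process is down, its ports are expected to refuse connections, so the result is only meaningful during the short window in which the process is up.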

| username: mono | Original post link

You have executed clean all. Are you trying to destroy the cluster and set up a new one?

| username: 月明星稀 | Original post link

iptables is not configured.

| username: 月明星稀 | Original post link

I originally thought it was data corruption and tried to check the details, but that also failed and could not be executed.

| username: yulei7633 | Original post link

This cluster has four PD nodes, which is not ideal. The number of PD nodes should be odd.
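
The reasoning behind the odd-number advice: PD embeds etcd (visible in the logs above), and etcd uses Raft, which needs a majority of members to be reachable. A minimal sketch of the arithmetic, with function names chosen only for illustration:

```python
def quorum(members: int) -> int:
    """Smallest majority of a Raft group with `members` voters."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Number of voters that can be down while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} PD nodes: quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Four nodes need three for a quorum and tolerate one failure, the same as three nodes, so the fourth node adds cost without adding fault tolerance. An even count is not by itself a reason for PD to fail to start.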

| username: lemonade010 | Original post link

No TiDB? Every component has an even number of nodes.

| username: Billmay表妹 | Original post link

It is recommended to separate the 3 PDs that are all placed on one machine.

| username: Billmay表妹 | Original post link

RawKV? Are you using only RawKV without TiDB?

It’s not recommended to use RawKV directly. RawKV is not the normal usage of TiDB, and if you encounter problems, others won’t be able to help you.

| username: TiDBer_小阿飞 | Original post link

This node failed to start up.

etcd startup failures are mostly related to the data-dir: the information recorded in the data-dir does not match what the etcd startup options specify.
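
One way to look for such a mismatch is to compare the `initial-cluster` value from the "PD Config" log line with the peer URLs in the "added member" log lines, which reflect the membership stored in the data-dir. The sketch below does this with values copied from the logs earlier in this thread. The helper names are made up for illustration, and because the pasted log is cut off after the third "added member" entry, the printed difference is not a diagnosis of this cluster.

```python
def parse_initial_cluster(value: str) -> dict[str, str]:
    """Split an etcd-style 'name=url,name=url' string into {name: peer_url}."""
    members = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, url = item.partition("=")
        members[name] = url
    return members


def missing_peers(configured: dict[str, str], seen_urls: set[str]) -> list[str]:
    """Names of configured members whose peer URL is absent from `seen_urls`."""
    return sorted(name for name, url in configured.items() if url not in seen_urls)


if __name__ == "__main__":
    # Copied from the "PD Config" log line in this thread.
    initial_cluster = (
        "pd-152.122.65.30-2379=https://152.122.65.30:2380,"
        "pd-152.122.65.32-2379=https://152.122.65.32:2380,"
        "pd-152.122.65.35-2379=https://152.122.65.35:2380,"
        "pd-58.220.65.85-2379=https://58.220.65.85:2380"
    )
    # Peer URLs from the "added member" lines that are visible before the log is cut off.
    seen = {"https://152.122.65.35:2380", "https://152.122.65.30:2380"}
    for name in missing_peers(parse_initial_cluster(initial_cluster), seen):
        print("configured but not seen in the log excerpt:", name)
```

With the complete log, any configured member that never appears in an "added member" line, or any added member that is not in `initial-cluster`, would be worth a closer look.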

If this type of error can be resolved by modifying the startup parameters, that is the best outcome. In extreme cases, there are two options:

One is to delete the data-dir.
The other is to copy the contents of the data-dir from another node, force-start it with the --force-new-cluster option, and then restore the cluster by adding new members.

You can try this in a test environment, but in a production environment, just pretend I didn’t say anything :joy:

| username: 月明星稀 | Original post link

How do you modify the startup parameters? If cleaning the data-dir can solve the issue, why can’t I start it even after cleaning the data with --data?

| username: 月明星稀 | Original post link

Three PDs are on different machines, not on the same machine.