"failed to start node: Other(\"[components/pd_client/src/util.rs:954]: duplicated store address: id:11001

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: ["failed to start node: Other("[components/pd_client/src/util.rs:954]: duplicated store address: id:11001

| username: Quino

[TiDB Usage Environment] Production Environment
[TiDB Version] 7.5.0
[Reproduction Path] Adding tikv pod node
[Encountered Problem: Problem Phenomenon and Impact] ["failed to start node: Other("[components/pd_client/src/util.rs:954]: duplicated store address: id:11001
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page

[Attachments: Screenshots/Logs/Monitoring]

[server.rs:975] ["failed to start node: Other(\"[components/pd_client/src/util.rs:954]: duplicated store address: id:11001 address:\"tidb-cluster-tikv-2.tidb-cluster-tikv-peer.tidb-cluster.svc:20160\" version:\"7.5.0\" peer_address:\"tidb-cluster-tikv-2.tidb-cluster-tikv-peer.tidb-cluster.svc:20160\" status_address:\"0.0.0.0:20180\" git_hash:\"bd8a0aabd08fd77687f788e0b45858ccd3516e4d\" start_timestamp:1706317572 deploy_path:\"/\" , already registered by id:1 address:\"tidb-cluster-tikv-2.tidb-cluster-tikv-peer.tidb-cluster.svc:20160\" labels:<key:\"host\" value:\"node1\" > version:\"7.5.0\" peer_address:\"tidb-cluster-tikv-2.tidb-cluster-tikv-peer.tidb-cluster.svc:20160\" status_address:\"0.0.0.0:20180\" git_hash:\"bd8a0aabd08fd77687f788e0b45858ccd3516e4d\" start_timestamp:1702571204 deploy_path:\"/\" last_heartbeat:1705852996678545391 node_state:Serving \")"] [thread_id=0x4]
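For context, the log shows the new store registration (id 11001) being rejected because an older store (id 1) is still registered in PD with the same address (tidb-cluster-tikv-2.tidb-cluster-tikv-peer.tidb-cluster.svc:20160). A minimal sketch for confirming this from inside the cluster, assuming pd-ctl ships at /pd-ctl in the PD image and that the namespace and PD pod name below match the defaults implied by the log (adjust to your cluster):

```shell
# List the stores PD knows about and show the entries registered for the
# tikv-2 address; the stale store id should appear here alongside the address.
kubectl exec -n tidb-cluster tidb-cluster-pd-0 -- \
  /pd-ctl -u http://127.0.0.1:2379 store \
  | grep -B 5 -A 5 "tidb-cluster-tikv-2"
```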

| username: Quino | Original post link

Dear experts, what is the cause of this issue and how can it be resolved?
TiDB Operator version: 1.5.2
TiDB version: 7.5.0
Deployed using k8s+kubesphere

| username: Kongdom | Original post link

Duplicated store address? What operations did you perform before scaling?

| username: Quino | Original post link

Thank you for the reply!
Before scaling, the tikv-2 node kept reporting the error ["failed to start node: Other(\"[components/pd_client/src/util.rs:756]: version should compatible with version 7.5.0, got 5.0.1\")"]. After I pinned the image version to 7.5.0, I deleted and recreated the pod several times and also deleted and recreated the pod's persistent volume. Then this issue occurred.

| username: Kongdom | Original post link

Did you delete it physically, or scale it in through tiup?

| username: Quino | Original post link

Thanks for the reply!
Since the cluster was not deployed with tiup, scaling out and in was done through KubeSphere, and the persistent volumes were deleted manually.
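A minimal sketch for checking what the manual PV deletion left behind, assuming TiDB Operator's default PVC naming (tikv-&lt;cluster&gt;-tikv-&lt;ordinal&gt;) and the tidb-cluster namespace seen in the log; the names are illustrative:

```shell
# List the PVCs and PVs still associated with the TiKV pods; a PVC stuck in a
# Pending or Lost state after its PV was deleted by hand is a common leftover.
kubectl get pvc -n tidb-cluster | grep tikv
kubectl get pv | grep tidb-cluster-tikv
```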


| username: Kongdom | Original post link

:sweat_smile: This is beyond my knowledge. Let’s wait for the K8S team to respond.

The issue has been recorded in the FAQ, but it’s for an older version, not K8S.

| username: Quino | Original post link

Sure :grinning: Thanks for the reply, expert!

| username: Kongdom | Original post link

Take a look at this; I feel the last reply might be helpful:
TiKV in a K8S cluster cannot be automatically taken offline - TiDB Q&A community

| username: Quino | Original post link

Thank you for the reply!
The node had previously hosted a TiKV instance, and the scale-in did not clean it up completely.
I would like to ask: for a deployment like this (on K8s), where is the old TiKV's data directory stored? Is it in the persistent volume?
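A minimal sketch for answering this by inspection, assuming the TiKV pod is tidb-cluster-tikv-2 in the tidb-cluster namespace; /var/lib/tikv is the mount point TiDB Operator typically uses for the TiKV data PVC, so verify it against your own pod spec:

```shell
# Show each container's volume mounts (and thus which PVC backs the data dir),
# then list the data directory inside the running pod.
kubectl get pod tidb-cluster-tikv-2 -n tidb-cluster \
  -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.volumeMounts}{"\n"}{end}'
kubectl exec -n tidb-cluster tidb-cluster-tikv-2 -- ls /var/lib/tikv
```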

| username: Kongdom | Original post link

:joy: I haven’t used K8S, so I can’t give any advice~

| username: Quino | Original post link

Sure thing :smile: Thanks for the reply, expert!

| username: dba远航 | Original post link

This is probably due to metadata confusion within PD.

| username: 哈喽沃德 | Original post link

The store id of the TiKV node is duplicated, right?

| username: TiDBer_jYQINSnf | Original post link

The old TiKV was scaled in by directly deleting the PV, right? Then run store delete xxx in pd-ctl to delete the old TiKV store. After it becomes a Tombstone, the new TiKV will come up automatically.
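A minimal sketch of this suggestion, assuming pd-ctl is available at /pd-ctl inside the PD pod and that the stale store is id 1 as shown in the error above; the namespace, pod name, and store id come from the log and may need adjusting for your cluster:

```shell
# Delete the stale store by id so PD stops treating the old registration
# as the owner of the tikv-2 address.
kubectl exec -n tidb-cluster tidb-cluster-pd-0 -- \
  /pd-ctl -u http://127.0.0.1:2379 store delete 1
# Once the store transitions to Tombstone, clean it up so the new TiKV
# (id 11001) can register with the same address.
kubectl exec -n tidb-cluster tidb-cluster-pd-0 -- \
  /pd-ctl -u http://127.0.0.1:2379 store remove-tombstone
```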