Issues with Multi-Instance Label Configuration and Replica Migration

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 多实例label配置副本迁移问题

| username: mono

TiDB version: 5.4.2
Cluster environment: 3 physical TiKV machines, each running 2 TiKV instances.
Problem description:
After adding the label configuration, I ran tiup cluster reload ** -R tikv,pd to make it take effect, and the leader replicas were migrated. The two replicas stored on the same server include a non-leader, so why isn't the non-leader replica migrated instead? Wouldn't migrating the non-leader replica have less impact on the cluster?
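
For context, a minimal sketch of the kind of label configuration this refers to (the addresses, ports, and label values below are assumptions, not the actual topology): the two TiKV instances on one physical machine share a host label, and PD is told to use that label for replica isolation.

```yaml
# Hypothetical excerpt from the tiup topology file (values are placeholders).
tikv_servers:
  - host: 172.20.0.1          # first TiKV instance on physical machine A
    port: 20160
    status_port: 20180
    config:
      server.labels: { host: "host-a" }
  - host: 172.20.0.1          # second TiKV instance on the same machine
    port: 20161
    status_port: 20181
    config:
      server.labels: { host: "host-a" }

pd_servers:
  - host: 172.20.0.2
    config:
      replication.location-labels: ["host"]
```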

(1) Restart log:
Evicting 2 leaders from store 172.20..:20161…
Still waiting for 2 store leaders to transfer… (repeated until both leaders had transferred)
Restarting instance 172.20..:20161
Restart instance 172.20..:20161 success
Evicting 1 leaders from store 172.20..:20161…
Still waiting for 1 store leader to transfer…
Still waiting for 1 store leader to transfer…
Still waiting for 1 store leader to transfer…
Still waiting for 1 store leader to transfer…
Restarting instance 172.20..:20161
Restart instance 172.20..:20161 success

| username: h5n1 | Original post link

Reloading means restarting the instances, and restarting an instance requires its leaders to be switched away first.
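
For illustration, while a reload is running you can watch this leader eviction from PD with pd-ctl; the store id and PD address below are placeholders, so adjust them to your cluster.

```shell
# A sketch, assuming pd-ctl is run through tiup ctl for the cluster version:
# `scheduler show` should list an evict-leader scheduler for the store being
# reloaded, and `store <store-id>` should show its leader_count dropping toward 0.
tiup ctl:v5.4.2 pd -u http://<pd-host>:2379 scheduler show
tiup ctl:v5.4.2 pd -u http://<pd-host>:2379 store <store-id>
```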

| username: mono | Original post link

Why switch? The instance hasn’t crashed! If the leader and non-leader are on the same physical machine, migrating the non-leader would be better, right?

| username: xingzhenxiang | Original post link

Only a new leader is elected; the data itself is not migrated. It also seems that, by default, if the leader transfer has not finished after more than 5 minutes, the instance is restarted directly anyway.
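
If I remember correctly, that wait time can be adjusted when running the reload; the flag below should be verified against tiup cluster reload --help for your tiup version.

```shell
# A hedged sketch: --transfer-timeout (in seconds) is supposed to control how
# long tiup waits for PD/TiKV leaders to transfer before restarting anyway.
tiup cluster reload <cluster-name> -R tikv,pd --transfer-timeout 600
```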

| username: xingzhenxiang | Original post link

This is the official documentation:

| username: 像风一样的男子 | Original post link

Restarting the node definitely requires re-electing the leader, not migrating data.

| username: mono | Original post link

My situation is like this. Before the label configuration, the two TiKV instances on server A stored two replicas of table t1, with peer_ids 1238 (the leader) and 1239. After configuring the labels, I wanted the replicas to be spread across the three physical machines. After the reload, I found that the leader replica 1238 had been migrated. What I expected was for replica 1238 to stay where it was and for replica 1239 to be migrated instead.
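
For anyone checking the same thing, a hedged way to see how the labels and the replica placement ended up (the Region id and PD address are placeholders):

```shell
# Confirm PD picked up the store labels and the location-labels setting,
# then look at one Region's peers and leader to see where they live now.
tiup ctl:v5.4.2 pd -u http://<pd-host>:2379 store
tiup ctl:v5.4.2 pd -u http://<pd-host>:2379 config show replication
tiup ctl:v5.4.2 pd -u http://<pd-host>:2379 region <region-id>
```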

| username: 大飞哥online | Original post link

Reload is a rolling restart of the instances one by one, so the leaders need to be re-elected first. During this process the database can still provide service; if the leaders were not re-elected, the Regions on the restarting instance would not be able to serve requests.
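
As a final sanity check, a hedged way to confirm that the rolling restart did not leave any Region under-replicated (the PD address is a placeholder):

```shell
# Empty results from these checks mean no Region is missing a peer or has a
# peer on a down store after the reload.
tiup ctl:v5.4.2 pd -u http://<pd-host>:2379 region check miss-peer
tiup ctl:v5.4.2 pd -u http://<pd-host>:2379 region check down-peer
```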