TiKV data is stored in 3 replicas, with 5 TiKV nodes configured. Can 3 TiKV nodes go down?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv的数据存储为3副本,配置了5个tikv节点,是否是可以宕3台tikv

| username: TiDBer_Y2d2kiJh

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] v5.4.0, 2 TiDB, 3 PD, 3 TiKV
[Reproduction Path] The current cluster has 3 TiKV nodes, with data stored in 3 replicas. We plan to deploy 5 TiKV and 6 PD nodes. After expanding to 5 TiKV nodes, will the data still be intact if 3 TiKV nodes go down? And with 5 TiKV nodes and 5 replicas, how many TiKV nodes can go down?

| username: 像风一样的男子 | Original post link

If labels are not set, at most one machine can go down; if two machines go down, some data will become unavailable.

| username: zhanggame1 | Original post link

No, by default, with three replicas, only one TiKV can fail without data loss. Adding more TiKV nodes still only allows one TiKV to fail.

With the default three replicas, the more TiKV nodes you have, the lower the cluster’s reliability, because the chance that two nodes are down at the same time grows with the node count.
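To put a rough number on that claim, here is a minimal sketch. It assumes each node fails independently with the same probability, which real clusters only approximate, and the value of `P_DOWN` is made up purely for illustration. It computes the chance that two or more nodes are down at once, which is the point where a three-replica cluster may start losing quorum on some regions.

```python
from math import comb


def prob_at_least_k_down(nodes: int, k: int, p: float) -> float:
    """Probability that k or more of `nodes` independent nodes are down at once."""
    return sum(
        comb(nodes, i) * p**i * (1 - p) ** (nodes - i)
        for i in range(k, nodes + 1)
    )


P_DOWN = 0.01  # assumed per-node probability of being down; illustrative only

for n in (3, 5, 10):
    # The probability of two simultaneous outages rises as nodes are added.
    print(f"{n} nodes: {prob_at_least_k_down(n, 2, P_DOWN):.6f}")
```

Two nodes being down does not make every region unavailable, only those with two replicas on the failed nodes, so this is an upper bound on how often trouble can start rather than a measure of how much data is affected.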

| username: hey-hoho | Original post link

In extreme cases, if the three replicas of a region happen to be on three downed TiKV nodes, then the data of that region is lost.

Data availability depends mainly on the number of replicas, not on the number of TiKV nodes. For example, even with 10 TiKV nodes, three replicas can only tolerate one node being down.
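The extreme case described here can be checked by brute force. The sketch below is not how PD actually places replicas. It assumes no labels are set and that every possible combination of stores holds the replicas of some region, which is a reasonable approximation for a large, evenly balanced cluster. The function name and status strings are hypothetical.

```python
from itertools import combinations


def worst_case_status(stores: int, replicas: int, failed: int) -> str:
    """Classify the worst-hit region when `failed` stores are down.

    Assumes every placement of `replicas` peers on `stores` stores is used
    by some region (no labels, evenly spread data).
    """
    down = set(range(failed))  # by symmetry, which stores fail does not matter
    quorum = replicas // 2 + 1  # a Raft group needs a majority of its peers
    worst = "available"
    for placement in combinations(range(stores), replicas):
        alive = sum(1 for store in placement if store not in down)
        if alive == 0:
            return "lost"  # every replica of this region sits on a downed store
        if alive < quorum:
            worst = "unavailable"  # replicas survive, but no majority
    return worst


for failed in (1, 2, 3):
    print(failed, worst_case_status(stores=5, replicas=3, failed=failed))
```

Under these assumptions, 5 stores with 3 replicas come out as available with 1 store down, unavailable with 2, and lost with 3. "Lost" here means all replicas are on the failed stores; whether the data is truly gone depends on whether those stores can be recovered.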

| username: TiDBer_Y2d2kiJh | Original post link

In the case of 5 TiKV server nodes and 5 replicas, how many TiKV nodes can go down?

| username: TiDBer_Y2d2kiJh | Original post link

Understood, with 3 replicas of TiKV, no matter how many servers there are, only one TiKV can go down.

| username: TiDBer_Y2d2kiJh | Original post link

Got it, thanks.

| username: 大飞哥online | Original post link

With 3 replicas, TiKV can tolerate 1 node failure.
With 5 replicas, TiKV can tolerate 2 node failures.
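Both numbers follow from the Raft majority rule: a region stays available as long as more than half of its replicas are alive. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def tolerated_failures(replicas: int) -> int:
    """Largest number of replicas that can be lost while a majority survives."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return (replicas - 1) // 2


for r in (3, 5):
    print(f"{r} replicas tolerate {tolerated_failures(r)} failure(s)")
```

This is also why an even replica count buys nothing extra: 4 replicas tolerate 1 failure, the same as 3.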

The number of replicas and the number of TiKV nodes do not have to be equal. You can have 10 TiKV nodes with 3 replicas or 5 replicas. Similarly, you can have 5 TiKV nodes with 3 replicas or 5 replicas.

| username: TiDBer_Y2d2kiJh | Original post link

So with 5 TiKV nodes and 5 replicas, 2 TiKV nodes can go down. If the cluster currently has 5 TiKV nodes and 3 replicas, can I change it directly to 5 replicas?

| username: tidb菜鸟一只 | Original post link

Sure, you can set max-replicas to 5 directly in the PD configuration, but keep in mind that 5 replicas will take up considerably more space.

| username: cassblanca | Original post link

Sure, just set max-replicas=5. P.S. The number of TiKV instances cannot be lower than the number of replicas.
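Before changing the setting, the two caveats in this thread (enough TiKV instances, more disk space) can be sanity-checked with a few lines. This is only a back-of-the-envelope sketch with a hypothetical function name. It does not talk to PD, and it assumes disk usage scales linearly with the replica count.

```python
def check_replica_change(store_count: int, new_replicas: int, old_replicas: int = 3) -> float:
    """Return the approximate storage growth factor for a replica-count change.

    Raises ValueError if there are fewer TiKV stores than requested replicas.
    """
    if new_replicas > store_count:
        raise ValueError(
            f"{new_replicas} replicas need at least {new_replicas} TiKV stores, "
            f"but only {store_count} are available"
        )
    return new_replicas / old_replicas


factor = check_replica_change(store_count=5, new_replicas=5)
print(f"Expected disk usage: about {factor:.2f}x the current usage")
```

Going from 3 to 5 replicas on 5 stores passes the check and implies roughly two thirds more disk usage, while the same change on a 3-store cluster is rejected.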

| username: 大飞哥online | Original post link

Sure, now the data is absolutely safe, hahaha. As long as there is enough disk space, it’s fine. The disk cost is a bit higher.

| username: cassblanca | Original post link

I wouldn’t dare say it’s absolute. If the rack goes down, it’s GG. Even remote disaster recovery can’t guarantee absolute safety. We can only aim to make RTO and RPO as close to zero as possible.

| username: 大飞哥online | Original post link

Hahaha, that’s true. We also need to consider the extreme case of a complete power outage in the data center.

| username: Fly-bird | Original post link

It is not possible.