A TiKV node has encountered a storage failure and its service has been stopped. Can TiKV be directly scaled out now?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 一个tikv节点存储出故障,将这个tikv的服务停止了,请问现在能直接进行tikv扩容吗?

| username: TiDBer_Y2d2kiJh

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] v5.4.0, with 2 TiDB, 3 PD, and 3 TiKV nodes
[Reproduction Path] The cluster has 3 TiKV nodes in total. One TiKV node has had a storage failure, and the TiKV service on that node has been stopped. Can TiKV be scaled out directly now?
[Encountered Issues: Problem Symptoms and Impact]
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: 像风一样的男子 | Original post link

You can refer to this article

| username: tidb菜鸟一只 | Original post link

Yes. In a three-node, three-replica TiKV setup, losing one node does not affect usage, because two of the three replicas of each Region are still available. You should promptly scale out by adding a new TiKV node, and then scale in the original faulty node.
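The order described above (add the new node first, then remove the faulty one) can be sketched as a small Python helper that only builds the topology snippet and the command lines, without running anything. It assumes the cluster is managed with TiUP, which the thread does not state; the cluster name `tidb-prod`, the host `10.0.1.4`, and the node ID `10.0.1.3:20160` are hypothetical placeholders.

```python
def scale_out_topology(new_host: str) -> str:
    """Return a minimal TiUP scale-out topology containing one new TiKV host."""
    return f"tikv_servers:\n  - host: {new_host}\n"


def replacement_plan(cluster: str, faulty_node: str,
                     topology_file: str = "scale-out.yaml") -> list:
    """Return the TiUP commands in the order suggested in the reply above.

    The commands are only built as argument lists; nothing is executed here.
    """
    return [
        # 1. Add the replacement TiKV node described in the topology file.
        ["tiup", "cluster", "scale-out", cluster, topology_file],
        # 2. Check that the new store shows up before touching the old one.
        ["tiup", "cluster", "display", cluster],
        # 3. Only then take the faulty node out of the cluster.
        ["tiup", "cluster", "scale-in", cluster, "--node", faulty_node],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(scale_out_topology("10.0.1.4"))
    for cmd in replacement_plan("tidb-prod", "10.0.1.3:20160"):
        print(" ".join(cmd))
```

Keeping the commands as lists makes the order easy to review before anyone runs them by hand. Whether the final scale-in completes cleanly on a node with a failed disk depends on the state of that node, so treat this as an outline rather than a tested runbook.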

| username: TiDBer_vfJBUcxl | Original post link

Sure, since the service on that node has already been stopped.

| username: Jellybean | Original post link

Sure, go ahead and scale out; the data won’t be lost.

| username: Amy_Jing | Original post link

Yes, you can. Here are some steps you can refer to:

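  1. First, confirm that the service on the faulty TiKV node is stopped and that the other nodes in the cluster are working normally. With only three TiKV nodes and three replicas, the faulty node cannot finish going offline until a replacement store has joined, so scale out before removing it.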

  2. Install and configure a new TiKV node on a new physical or virtual machine. Ensure that the configuration of this node is consistent with the other working nodes, including storage paths, cluster addresses, etc.

  3. Point the new TiKV node at the PD endpoints of the cluster, which can usually be reached at http://<pd_ip>:<pd_port>. There is no separate step for adding the node in PD; a TiKV node registers itself with PD when it starts.

  4. Start the TiKV service on the new node so that it joins the TiKV cluster. You can use systemctl start tikv or a similar command to do this.

Once the new TiKV node has joined the cluster, PD starts scheduling Region replicas onto it, so data migration and load balancing happen automatically. Make sure the performance of the new node is comparable to that of the other nodes, and scale in the faulty node once the new node is serving data, to keep the whole cluster stable and performant.
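One way to follow the migration is to look at the store list that pd-ctl prints with its `store` command. The sketch below parses such JSON output and reports the state and Region count of each store. The field names (`stores`, `store.address`, `store.state_name`, `status.region_count`) follow what pd-ctl typically prints, but treat the exact layout as an assumption for your version; the sample data is made up for illustration.

```python
import json

# Made-up sample in the shape of `pd-ctl store` output (assumed layout).
SAMPLE = """
{
  "count": 4,
  "stores": [
    {"store": {"id": 1, "address": "10.0.1.1:20160", "state_name": "Up"},
     "status": {"leader_count": 40, "region_count": 100}},
    {"store": {"id": 2, "address": "10.0.1.2:20160", "state_name": "Up"},
     "status": {"leader_count": 42, "region_count": 100}},
    {"store": {"id": 3, "address": "10.0.1.3:20160", "state_name": "Offline"},
     "status": {"leader_count": 0, "region_count": 12}},
    {"store": {"id": 4, "address": "10.0.1.4:20160", "state_name": "Up"},
     "status": {"leader_count": 18, "region_count": 88}}
  ]
}
"""


def summarize_stores(pd_store_json: str) -> dict:
    """Map each store address to a (state_name, region_count) pair."""
    data = json.loads(pd_store_json)
    summary = {}
    for entry in data.get("stores", []):
        store = entry.get("store", {})
        status = entry.get("status", {})
        # A missing region_count is treated as 0.
        summary[store.get("address")] = (
            store.get("state_name"),
            status.get("region_count", 0),
        )
    return summary


def fully_removed(summary: dict, address: str) -> bool:
    """True once the store is Tombstone and holds no Regions."""
    state, regions = summary[address]
    return state == "Tombstone" and regions == 0


if __name__ == "__main__":
    stores = summarize_stores(SAMPLE)
    for addr, (state, regions) in stores.items():
        print(f"{addr}: {state}, {regions} regions")
    print("faulty store removed:", fully_removed(stores, "10.0.1.3:20160"))
```

In this made-up sample the faulty store is still `Offline` with 12 Regions left, so the helper reports that removal has not finished yet.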

| username: redgame | Original post link

A three-node, three-replica TiKV setup can tolerate one node failure without affecting usage.
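The arithmetic behind this is the Raft majority rule that TiKV uses for each Region: a write needs acknowledgement from more than half of the replicas. A minimal Python sketch of that calculation (the function names are my own, chosen for illustration):

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still exists."""
    return replicas - quorum(replicas)


def region_available(replicas: int, failed: int) -> bool:
    """True if the surviving replicas can still form a majority."""
    return replicas - failed >= quorum(replicas)


if __name__ == "__main__":
    print(quorum(3))               # 2
    print(tolerated_failures(3))   # 1
    print(region_available(3, 1))  # True
    print(region_available(3, 2))  # False
```

With three replicas the majority is two, so one failed node is tolerated but a second failure before the replacement is in place would make Regions unavailable. That is the reason for adding the new node promptly.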
