Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 集群只有一个tidb节点,该节点宕机了可以直接强制缩容不? (The cluster has only one TiDB node and it is down; can I just force scale it in?)
【TiDB Usage Environment】Production Environment / Testing / PoC
【TiDB Version】5.3.3
【Reproduction Path】What operations were performed when the issue occurred
【Encountered Issue: Problem Phenomenon and Impact】The cluster has only one TiDB node, and now the machine has failed. How can this be handled? Thank you!
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】
Technically, you can scale the node in and scale it back out later, but while there is no tidb-server, the business traffic will be unable to connect.
Okay, I’ll give it a try, thank you!
TiDB nodes can be scaled down first and then scaled up.
Initially, there were 3 TiKV nodes: 1.3, 1.4, and 1.5. Nodes 1.4 and 1.5 were forcibly scaled in, leaving only 1.3. After scaling the TiDB node in and then out again, it fails to start and continuously logs connection attempts to 1.4 and 1.5, even though those nodes no longer exist. How can this be resolved so the TiDB node comes up? Thank you!
Force scale-in, then scale-out again.
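For reference, the force scale-in and later scale-out would look roughly like this (the cluster name, the TiDB node address, and the `scale-out.yaml` topology file are all placeholders to adapt to your deployment):

```shell
# Force-remove the failed TiDB node; --force skips the normal graceful offline
# since the host is already down (address is hypothetical).
tiup cluster scale-in <cluster-name> --node 10.0.1.3:4000 --force

# Later, add a TiDB node back from a topology file describing the new instance.
tiup cluster scale-out <cluster-name> scale-out.yaml
```

The `scale-out.yaml` file only needs the new component's section, e.g. a `tidb_servers` entry with the new host.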
With the default 3 replicas, TiKV cannot be scaled in to fewer than 3 nodes; forcing it risks data loss. The specific recovery operations are as follows:
Online Unsafe Recovery Documentation | PingCAP Documentation Center
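Very roughly, the Online Unsafe Recovery flow from that doc is driven through pd-ctl; the PD address and store IDs below are placeholders you must look up first:

```shell
# Find the store IDs of the permanently lost TiKV nodes (1.4 and 1.5 here).
tiup ctl:v5.3.3 pd -u http://<pd-host>:2379 store

# Ask PD to unsafely recover Regions that lost their replica quorum on those
# stores. Store IDs 4,5 are examples only.
tiup ctl:v5.3.3 pd -u http://<pd-host>:2379 unsafe remove-failed-stores 4,5

# Check recovery progress.
tiup ctl:v5.3.3 pd -u http://<pd-host>:2379 unsafe remove-failed-stores show
```

Note that this discards any data whose last replica lived on the removed stores; it restores availability, not the lost writes.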
Your TiKV scale-in seems to be stuck. You should scale TiKV back out first.
TiKV cannot be used this way. When there are only 3 nodes in TiKV, forcibly scaling down 2 nodes requires special recovery. Refer to the link posted above.
Did you confuse the TiDB node with the TiKV node?
You really shouldn't operate a production environment like this; there's a high chance of data loss. If you have a backup, redeploy a new cluster and restore into it.
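If a BR full backup exists, restoring into a freshly deployed cluster would look something like this sketch (PD address and backup storage path are placeholders):

```shell
# Restore a full BR backup into the new cluster. The storage URL must point
# at wherever the backup was written (local path, S3, etc. -- example only).
br restore full \
  --pd "http://<pd-host>:2379" \
  --storage "s3://<bucket>/<backup-prefix>"
```

On v5.x you may need to run `br` from the matching toolkit package so the backup and cluster versions line up.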
The description seems off. Next time, please post the cluster topology output so we can judge whether the node can be scaled in.
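The usual way to share the topology is the display subcommand (cluster name is a placeholder):

```shell
# Print every component instance with its host, status, and data directory.
tiup cluster display <cluster-name>
```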
If three TiKV nodes are forcibly scaled down to two, the cluster has already lost data. How can TiDB still connect…
Are you trying to test the robustness of the cluster? Scaling TiDB in and out while only one TiKV node remains is inherently risky; what happens at that point is unpredictable.
Your title and opening are about a TiDB node, but the scale-in you then describe is on TiKV. That mix-up makes the question very confusing.