Primary Node Failure

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 主节点宕机

| username: TiDBer_nrpaXrZw

v6.1.1
The primary node has crashed and is no longer usable. The tiup command was installed on that node, so without it no cluster operations can be performed.
tiup has now been deployed on a standby node, but it cannot be associated with the old cluster.
How can it be associated?

| username: tidb菜鸟一只 | Original post link

Do you have a backup of the original node's tiup configuration?
Or, if the primary node's disk hasn't also failed, can't the original files be recovered from it?

| username: TiDBer_nrpaXrZw | Original post link

No backup, just a disk failure :joy:

| username: tidb菜鸟一只 | Original post link

Is the topology file still available?

| username: TiDBer_nrpaXrZw | Original post link

The files were not backed up, but I have a template, so I should be able to reconstruct the previous configuration.

| username: tidb菜鸟一只 | Original post link

On the standby node, write the original cluster information into a topology.yaml (you need to enter all of the original cluster's node information, including IPs, ports, configurations, and labels, exactly as in the original).

Then execute:

tiup cluster deploy tidb-xxx v6.1.1 ./topology.yaml
tiup cluster display tidb-xxx

Check if you can see the cluster…
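
As a hedged illustration only, a minimal topology.yaml for this step might look like the sketch below. The user, directories, and IPs here are placeholders; every host, port, directory, and label must be filled in to match the original cluster's deployment exactly.

```yaml
# Hypothetical sketch of a topology.yaml: all values are placeholders
# and must be replaced with the original cluster's actual settings.
global:
  user: tidb
  deploy_dir: /tidb-deploy
  data_dir: /tidb-data

pd_servers:
  - host: 10.0.1.1

tidb_servers:
  - host: 10.0.1.2

tikv_servers:
  - host: 10.0.1.3
  - host: 10.0.1.4
  - host: 10.0.1.5
```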

| username: TiDBer_nrpaXrZw | Original post link

Alright, thank you.

| username: TiDB_C罗 | Original post link

I’ve also thought about this scenario. I wonder if future versions will allow connecting to any node to generate cluster information.

| username: Jellybean | Original post link

tiup and the cluster configuration should have an option to automatically back themselves up to another location for safety. A failure like the one the original poster hit can be quite serious: it is like suddenly losing the steering wheel while driving on the highway, leaving you unable to manage the cluster.

| username: redgame | Original post link

It’s quite troublesome without any backups.

| username: zhanggame1 | Original post link

tiup cluster comes with built-in backup and restore functionality for its metadata. After installing a cluster, it is best to back up and keep the configuration.
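
Besides any built-in tiup facility, a simple self-managed safeguard is to archive the control machine's tiup home directory and copy it off-host. The following is a hedged sketch under the assumption that tiup keeps its state under ~/.tiup (the default); the backup destination is a placeholder.

```shell
#!/bin/sh
# Hedged sketch: archive the tiup control machine's metadata directory
# (~/.tiup by default) so it can be restored on another host after a
# disk failure. Paths and the destination are placeholders.
set -e

TIUP_HOME="${TIUP_HOME:-$HOME/.tiup}"
BACKUP_FILE="/tmp/tiup-meta-$(date +%Y%m%d).tar.gz"

mkdir -p "$TIUP_HOME"   # no-op on a real control machine
# Archive the whole tiup home, preserving its relative layout.
tar -czf "$BACKUP_FILE" -C "$(dirname "$TIUP_HOME")" "$(basename "$TIUP_HOME")"
echo "tiup metadata archived to $BACKUP_FILE"
# Copy it off the control machine, e.g.:
#   scp "$BACKUP_FILE" backup-host:/backups/
```

Running this from cron on the control machine would have avoided the situation in this thread entirely.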

| username: zhanggame1 | Original post link

There is a solution, and it is not complicated. Refer to the documentation:
Column - What to do if the central control of the TiDB cluster is unavailable? | TiDB Community

| username: 像风一样的男子 | Original post link

It looks quite serious. It scared me into quickly doing an automatic backup with tiup.

| username: 大飞哥online | Original post link

As long as you have the topology.yaml configuration file, everything is easy: place it in the tiup directory on the new machine, then run tiup cluster display to check, and you're OK.
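
For orientation, tiup keeps the metadata for each managed cluster under its home directory; restoring that tree is what lets tiup cluster display find the cluster again. A hedged sketch of the expected location (the cluster name tidb-xxx is a placeholder from earlier in the thread):

```shell
#!/bin/sh
# Hedged sketch: tiup stores managed-cluster metadata at
# ~/.tiup/storage/cluster/clusters/<cluster-name>/meta.yaml.
TIUP_HOME="${TIUP_HOME:-$HOME/.tiup}"
META="$TIUP_HOME/storage/cluster/clusters/tidb-xxx/meta.yaml"
echo "cluster metadata expected at: $META"
```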

| username: 大飞哥online | Original post link

If there is no topology.yaml file, you can only manually write all the information of the cluster. If there are only a few TiKV, TiDB, and TiFlash instances, it’s fine. But if there are many, it will be quite laborious.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.