Is it normal for nodes to become abnormal during a BR restore?

As shown in the picture:

BR probably crashed the node.

All PD nodes are down, which is definitely abnormal.
Check the logs to see what they report.

There seems to be an issue with the restore operation. When you ran the PITR restore, you didn’t specify the path of the full backup, right? Alternatively, you can skip PITR and run a full restore instead; that should let you restore.
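To make the two options concrete, here is a minimal sketch that only builds the BR command lines. It assumes BR is run through `tiup br` and uses the documented `--pd`, `--storage`, `--full-backup-storage`, and `--restored-ts` flags. The helper names are hypothetical, and the PD address and storage URIs are placeholders you pass in.

```python
import shlex

def restore_full_cmd(pd_addr, snapshot_uri):
    """Command for a snapshot-only (full) restore."""
    return ["tiup", "br", "restore", "full",
            "--pd", pd_addr,
            "--storage", snapshot_uri]

def restore_point_cmd(pd_addr, log_uri, snapshot_uri, restored_ts=None):
    """Command for a PITR restore that also names the full backup.

    --storage points at the log backup, --full-backup-storage at the
    snapshot backup taken before the target restore time.
    """
    cmd = ["tiup", "br", "restore", "point",
           "--pd", pd_addr,
           "--storage", log_uri,
           "--full-backup-storage", snapshot_uri]
    if restored_ts is not None:
        # Optional: restore up to this point in time.
        cmd += ["--restored-ts", restored_ts]
    return cmd

def as_shell(cmd):
    """Render an argv list as a copy-pasteable shell line."""
    return shlex.join(cmd)
```

Print `as_shell(restore_point_cmd(...))` with your own addresses to review the command before running it. Check the flags against the BR version you actually have installed.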

First, get the cluster back to normal, since PD is not working. Then start with a full restore before running restore point.

I had already run a full restore before the restore point; the two were executed separately.

Your cluster only has two TiKV nodes? You need at least 3.
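For context on why two nodes is fragile: PD (through its embedded etcd) and TiKV both rely on Raft majorities, and the default replica count (`max-replicas`) is 3. A rough sketch of the arithmetic, with helper names made up for illustration:

```python
def quorum(members):
    """Smallest majority of a Raft group with this many members."""
    return members // 2 + 1

def tolerated_failures(members):
    """How many members can be lost while a majority remains."""
    return members - quorum(members)

for n in (1, 2, 3):
    print(n, "members: quorum", quorum(n),
          "tolerates", tolerated_failures(n), "failure(s)")
```

With 2 members the quorum is 2, so losing either one stops the group. Two nodes tolerate no more failures than one, which is why 3 is the usual minimum.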

The node distribution is unreasonable, causing crashes.

First, check the PD logs to see why it crashed. Once it has returned to normal, then check the BR logs.
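If the PD log is large, a small script can pull out just the severe entries. This is a minimal sketch that assumes the log follows the usual TiDB-style format, `[time] [LEVEL] [source] [message]`. The function names are hypothetical, and the log path depends on your deployment.

```python
import re
from collections import Counter

# Matches the level field at the start of a TiDB-style log line, e.g.
# [2024/01/01 12:00:00.000 +08:00] [ERROR] [server.go:123] ["message"]
LEVEL_RE = re.compile(r"^\[[^\]]+\] \[(FATAL|ERROR|WARN|INFO|DEBUG)\]")

def severe_lines(lines, levels=("ERROR", "FATAL")):
    """Return (level, line) pairs for lines logged at the given levels."""
    hits = []
    for line in lines:
        m = LEVEL_RE.match(line)
        if m and m.group(1) in levels:
            hits.append((m.group(1), line.rstrip("\n")))
    return hits

def summarize(path, limit=20):
    """Count severe entries in a log file and return the first few."""
    with open(path, encoding="utf-8", errors="replace") as f:
        hits = severe_lines(f)
    counts = Counter(level for level, _ in hits)
    return counts, hits[:limit]
```

Call `summarize()` with the path to pd.log on each PD node and compare the first ERROR or FATAL entries with the time the BR restore started.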

It’s very strange. I ran the recovery several times, and during the entire recovery process, the status of these nodes alternated between normal and abnormal. There were two final outcomes:

  1. After the recovery was completed, everything returned to normal, and the data was successfully restored.
  2. The nodes remained abnormal, and the recovery task ultimately failed.

There were also “not leader” errors reported.

There are some ERRORs in the PD logs, but I’m not quite sure what the root cause of the node crash is.

It’s not normal. Check the logs.

Why is it unreasonable? Are you referring to deploying multiple roles on a single machine?

This shouldn’t have much of an impact, right?

They were executed separately: first the full backup restore, then the PITR (Point-In-Time Recovery) restore.

What kind of topology is this… 2 PD, 2 TiDB, 2 TiKV…

I don’t have enough resources; it’s for local testing :grinning:. Is it necessary to strictly follow the recommended topology?

Check the logs.

For testing, it’s better to use tiup playground directly, or set up 1 PD, 1 TiDB, and 1 TiKV, and configure TiKV with 1 replica…
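A minimal sketch of the playground option, assuming the `--db`, `--pd`, and `--kv` flags of `tiup playground`. The helper name is hypothetical, and the version argument is optional. For a manually deployed single-TiKV cluster, the replica count is the PD setting `replication.max-replicas`.

```python
def playground_cmd(version=None, db=1, pd=1, kv=1):
    """Build a `tiup playground` command for a small local test cluster."""
    cmd = ["tiup", "playground"]
    if version:
        # e.g. "v7.5.0"; omit to let tiup pick its default version
        cmd.append(version)
    cmd += ["--db", str(db), "--pd", str(pd), "--kv", str(kv)]
    return cmd
```

Run it with `subprocess.run(playground_cmd(), check=True)`, or print the list and paste the command into a shell.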