A Question About BR "Supporting Cold Data Backup to External Storage"

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于br “支持数据冷备份到外部存储” 的一点疑问

| username: 滴滴嗒嘀嗒

Cold backup: The database system needs to be shut down before performing the relevant data backup.

So, does shutting down the TiDB database system mean shutting down all nodes in the cluster or just the TiDB nodes?

If I shut down every component, the BR backup command fails with an error (cannot connect to PD).

After I start PD, the BR log reports an error (cannot connect to TiKV).

After I also start TiKV, the backup task runs successfully.

So, it seems that the cold backup mentioned in the official documentation refers to shutting down all TiDB nodes? Is this understanding correct? Can shutting down only the TiDB nodes ensure the consistency of the backup data?

| username: 裤衩儿飞上天 | Original post link

The documentation should say “hot backup” here. You can move this post to the feedback section to report it~

| username: zhanggame1 | Original post link

Cold backup refers to copying the database while it is stopped. The term might be incorrect; it should be written as physical backup.

| username: 滴滴嗒嘀嗒 | Original post link

However, if I shut down all TiDB nodes to ensure no data is written during the backup, although the database system is not completely shut down, it is not considered to be running normally, right? In this case, doesn’t it contradict the definition of “hot backup”?

| username: 滴滴嗒嘀嗒 | Original post link

Stopping all TiDB nodes = stopping the database? Or is it considered stopping part of the database? :smile:

| username: 裤衩儿飞上天 | Original post link

  1. For database software in general, cold backup refers to directly copying the physical files.
  2. TiDB has a storage-compute separation architecture. You can think of the TiDB server as the SQL entry point, while the data itself is stored in TiKV. Shutting down the TiDB server only closes the SQL entry point; data in the TiKV layer can still be read and written through the KV API.

| username: cassblanca | Original post link

Cold backup generally requires downtime, and it is a relatively simple and straightforward file backup.

| username: 滴滴嗒嘀嗒 | Original post link

Is the shutdown mainly to prohibit data writing? For a distributed architecture like TiDB, stopping all TiDB instances would prohibit data writing. In that case, does stopping TiDB equal a shutdown? :smile:

| username: 滴滴嗒嘀嗒 | Original post link

So it seems that with this kind of distributed architecture, stopping TiDB does not equal downtime, right?

There’s another point I’m not sure I understand correctly. If the backup takes a long time and data is written while it is running, then thanks to MVCC, those writes will not actually affect the final backup result, right? When BR issues a backup command, there is a parameter called `--backupts`. TiKV only backs up the KV versions that are visible at that backupts; data written during the backup is committed after the backupts, so it is not included in the backup.
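
To make the question concrete, here is a minimal sketch of the idea, assuming a toy in-memory multi-version store. `MvccStore`, `put`, and `snapshot_scan` are hypothetical names for illustration only, not BR or TiKV APIs, and details such as deletes and locks are not modelled.

```python
from collections import defaultdict


class MvccStore:
    """Toy multi-version store: each key keeps a list of (commit_ts, value)."""

    def __init__(self):
        self._versions = defaultdict(list)

    def put(self, key, value, commit_ts):
        # Every write adds a new version instead of overwriting the old one.
        self._versions[key].append((commit_ts, value))

    def snapshot_scan(self, read_ts):
        """For each key, return the newest value with commit_ts <= read_ts."""
        result = {}
        for key, versions in self._versions.items():
            visible = [(ts, v) for ts, v in versions if ts <= read_ts]
            if visible:
                result[key] = max(visible, key=lambda pair: pair[0])[1]
        return result


store = MvccStore()
store.put("a", "v1", commit_ts=10)
store.put("b", "v1", commit_ts=20)

backup_ts = 25  # assumed snapshot timestamp fixed when the backup starts

# Writes that arrive while the (slow) backup is still running.
store.put("a", "v2", commit_ts=30)
store.put("c", "v1", commit_ts=35)

backup = store.snapshot_scan(backup_ts)
print(backup)  # only versions committed at or before backup_ts
```

In this toy model the later writes to `a` and `c` exist in the store but are invisible to a scan at `backup_ts`, which is the behaviour I am asking about.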

| username: 裤衩儿飞上天 | Original post link

Yes.
This type of backup is called snapshot backup in the official documentation. If you do not specify `--backupts`, BR selects the snapshot corresponding to the start time of the backup.
Combining it with log backup makes PITR (Point-In-Time Recovery) possible.
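
Conceptually, snapshot plus log can be sketched as below. This is only an illustration under simplifying assumptions: `restore_to` and the `(commit_ts, key, value)` log format are hypothetical and do not reflect how BR actually stores or replays log backups.

```python
def restore_to(snapshot, backup_ts, log, target_ts):
    """Rebuild the state at target_ts from a snapshot taken at backup_ts
    plus a change log.

    log is an iterable of (commit_ts, key, value); value None means delete.
    """
    if target_ts < backup_ts:
        raise ValueError("target_ts is earlier than the snapshot")
    state = dict(snapshot)
    # Replay, in commit order, only the changes made after the snapshot
    # and no later than the requested point in time.
    for commit_ts, key, value in sorted(log, key=lambda entry: entry[0]):
        if backup_ts < commit_ts <= target_ts:
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
    return state


snapshot = {"a": "v1", "b": "v1"}  # assumed to be taken at backup_ts = 25
change_log = [
    (30, "a", "v2"),
    (35, "c", "v1"),
    (40, "b", None),  # delete of key "b"
]

state_at_35 = restore_to(snapshot, 25, change_log, 35)
state_at_40 = restore_to(snapshot, 25, change_log, 40)
print(state_at_35)
print(state_at_40)
```

The snapshot fixes the starting point, and the log fills in every change between that point and the chosen recovery time.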

| username: zhanggame1 | Original post link

According to the official 303 course, BR performs hot backups, while cold backup of a database specifically refers to stopping the database and then copying the data files.


| username: zhanggame1 | Original post link

BR relies on MVCC to keep the backup data consistent: the backed-up data reflects a single, specific point in time.
