Do data writes during a BR backup affect the backup result?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 如果在使用br备份的过程中有数据写入,会对备份结果造成影响吗?

| username: 滴滴嗒嘀嗒

For example, if a BR backup takes 1 hour (let’s say from 9:00 to 10:00), and I insert a large amount of data into a table at 9:30, will it affect the final backup result? Or will the final backup result be consistent with the data at the start of the backup (9:00)?

| username: Daniel-W | Original post link

There will be no impact.

| username: Jellybean | Original post link

For a full backup, BR uses snapshot reads. By default, BR selects the snapshot that corresponds to the backup start time, so the backed-up data is consistent with the data as of the start of the backup (9:00).
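
To make the snapshot-read idea concrete, here is a minimal toy sketch in Python. It is not BR or TiKV code; the class `ToyMVCCStore`, its methods, and the integer timestamps (900 standing in for 9:00, 930 for 9:30) are all made up for illustration. It only shows why a write committed after the snapshot timestamp is invisible to a scan taken at that timestamp.

```python
from bisect import bisect_right


class ToyMVCCStore:
    """Toy multi-version store: each key keeps a list of (commit_ts, value)."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), sorted by commit_ts

    def put(self, key, value, commit_ts):
        versions = self._versions.setdefault(key, [])
        versions.append((commit_ts, value))
        versions.sort(key=lambda v: v[0])

    def snapshot_scan(self, snapshot_ts):
        """Return the newest version of every key with commit_ts <= snapshot_ts."""
        result = {}
        for key, versions in self._versions.items():
            timestamps = [ts for ts, _ in versions]
            idx = bisect_right(timestamps, snapshot_ts)
            if idx > 0:
                result[key] = versions[idx - 1][1]
        return result


store = ToyMVCCStore()
store.put("row1", "a", commit_ts=850)   # written before the backup starts
backup_ts = 900                         # snapshot chosen at backup start (9:00)
store.put("row1", "b", commit_ts=930)   # update at 9:30, while the backup runs
store.put("row2", "c", commit_ts=930)   # insert at 9:30, while the backup runs

backup = store.snapshot_scan(backup_ts)
print(backup)  # {'row1': 'a'}
```

The scan at `backup_ts` still sees the 9:00 state even though newer versions already exist in the store, which is the behavior described above.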

You can also explicitly specify the physical time point of the snapshot to back up. If the data for that snapshot has already been garbage collected (GC), the br backup command reports an error and exits. The best practice is to estimate the time needed and increase the GC lifetime before the backup, then restore the original value after the backup completes.
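
The relationship between the requested snapshot time and the GC lifetime can be sketched as below. This is a simplified model, not BR's actual check: it assumes the GC safe point is simply "now minus the GC lifetime", and the names `check_snapshot` and `SnapshotGarbageCollected` are hypothetical.

```python
from datetime import datetime, timedelta


class SnapshotGarbageCollected(Exception):
    """Raised when the requested snapshot is older than the GC safe point."""


def check_snapshot(snapshot_time, now, gc_life_time):
    # Simplifying assumption: the safe point trails "now" by exactly gc_life_time.
    safe_point = now - gc_life_time
    if snapshot_time < safe_point:
        raise SnapshotGarbageCollected(
            f"snapshot {snapshot_time} is older than safe point {safe_point}"
        )
    return True


now = datetime(2024, 1, 1, 9, 0)        # the backup command is run at 9:00
requested = datetime(2024, 1, 1, 8, 0)  # explicitly requested snapshot: 8:00

try:
    default_ok = check_snapshot(requested, now, timedelta(minutes=10))
except SnapshotGarbageCollected:
    default_ok = False  # with a 10-minute lifetime the 8:00 snapshot is gone

# After raising the GC lifetime to 2 hours, the same snapshot is still readable.
extended_ok = check_snapshot(requested, now, timedelta(hours=2))
```

In this model the backup of the 8:00 snapshot fails with a 10-minute lifetime and succeeds once the lifetime covers the gap, which is why the lifetime should be raised before the backup and restored afterwards.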

| username: zhanggame1 | Original post link

There is no impact. BR reads blocks directly and uses the MVCC mechanism to export data that is consistent as of a specific point in time.

| username: wfxxh | Original post link

There is no impact. Data added between the start and the completion of the full backup will not be included in the backup.

| username: TiDBer_小阿飞 | Original post link

No, because of the MVCC mechanism.

| username: 昵称想不起来了 | Original post link

No impact, because BR uses snapshot reads. However, a backup may affect your SQL performance, so running backups at the same time as heavy read/write workloads is not recommended.

| username: Kongdom | Original post link

There will be no impact, and BR automatically adjusts GC during the backup.

| username: 滴滴嗒嘀嗒 | Original post link

What about log backups? Similarly, assume the log backup starts at 9:00 and stops at 10:00, and a large amount of data is inserted into a table at 9:30. Will this affect the final backup result?

| username: 舞动梦灵 | Original post link

A backup is tied to a TSO value, isn't it? BR captures the current timestamp when the backup starts, and the whole backup is based on that point in time. If rows are inserted, updated, or deleted while the backup runs, MVCC is used to find the version of each value that was current at that timestamp.

The default tikv_gc_life_time is 10 minutes. According to the official documentation, from version 4.0.9 onwards this parameter is adjusted automatically while a BR backup is running. It can be understood as the retention time for historical data.

A value of 10 minutes means that versions of data changed within the last 10 minutes can still be queried. Regardless of how many operations you perform during the backup, the backup result is the data as of the TSO at which the backup started.
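
As a small illustration of the TSO mentioned above: a TiDB TSO is a 64-bit integer whose high bits hold a physical timestamp in milliseconds and whose low 18 bits hold a logical counter. The Python sketch below composes and decodes such a value; the helper names `compose_tso` and `tso_physical_time` and the example date are made up for illustration.

```python
from datetime import datetime, timezone

LOGICAL_BITS = 18  # the low 18 bits of a TSO are the logical counter


def compose_tso(physical_ms, logical=0):
    """Build a TSO from a physical time in milliseconds and a logical counter."""
    return (physical_ms << LOGICAL_BITS) | logical


def tso_physical_time(tso):
    """Extract the physical part of a TSO as a UTC datetime."""
    physical_ms = tso >> LOGICAL_BITS
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)


# Hypothetical backup start time: 9:00 UTC on an arbitrary day.
start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
start_ms = int(start.timestamp() * 1000)

backup_tso = compose_tso(start_ms, logical=5)
decoded = tso_physical_time(backup_tso)
print(backup_tso, decoded.isoformat())
```

Decoding the physical part recovers the 9:00 start time, which is the point in time the whole backup is read at.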

| username: wfxxh | Original post link

Log backup is essentially an incremental backup that continuously backs up KV change logs, so it never stops unless you stop it manually. Once you start a log backup, all KV changes are backed up to storage. In your example, the data inserted at 9:30 would therefore be part of the log backup.
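
The difference from a snapshot backup can be sketched with another toy model in Python. Nothing here reflects BR's real implementation or storage format; `ToyLogBackup`, `replay`, and the integer timestamps are hypothetical. The sketch only shows that changes made while the log backup task is running are recorded, and that replaying the log on top of a base state reproduces them.

```python
class ToyLogBackup:
    """Toy change-log recorder: captures KV changes only while it is running."""

    def __init__(self):
        self.running = False
        self.log = []  # list of (ts, key, value)

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def on_kv_change(self, ts, key, value):
        if self.running:
            self.log.append((ts, key, value))


def replay(base, log, until_ts):
    """Apply logged changes with ts <= until_ts on top of a base state."""
    state = dict(base)
    for ts, key, value in sorted(log, key=lambda entry: entry[0]):
        if ts <= until_ts:
            state[key] = value
    return state


base_state = {"row1": "a"}            # state when the log backup starts (9:00)
task = ToyLogBackup()
task.start()                          # 9:00
task.on_kv_change(930, "row2", "c")   # insert at 9:30 is captured
task.stop()                           # 10:00, stopped manually
task.on_kv_change(1030, "row3", "d")  # change after the stop is not captured

restored = replay(base_state, task.log, until_ts=1000)
print(restored)  # {'row1': 'a', 'row2': 'c'}
```

Unlike the snapshot case, the 9:30 insert appears in the restored state because the task was running when the change happened.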

| username: TiDBer_QYr0vohO | Original post link

No, there is an MVCC mechanism.

| username: Fly-bird | Original post link

Thanks to the MVCC mechanism, the backup reads data as of the snapshot time, so it is not affected.
