Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 如果在使用br备份的过程中有数据写入,会对备份结果造成影响吗? (If data is written while a BR backup is running, will it affect the backup result?)
For example, if a BR backup takes 1 hour (let’s say from 9:00 to 10:00), and I insert a large amount of data into a table at 9:30, will it affect the final backup result? Or will the final backup result be consistent with the data at the start of the backup (9:00)?
If it is a full backup, BR uses snapshot read during the backup. By default, BR will select the snapshot corresponding to the start time of the backup. Therefore, the conclusion is that the data is consistent with the data at the start time of the backup (9:00).
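To make the snapshot-read idea concrete, here is a minimal sketch of a multi-version store in Python. It is only a toy model, not BR's or TiKV's actual implementation: the `MVCCStore` class, its method names, and the integer timestamps (850, 900, 930, 945) are all made up for illustration.

```python
from bisect import bisect_right


class MVCCStore:
    """Toy multi-version store: each key maps to a list of (commit_ts, value)."""

    def __init__(self):
        self._versions = {}

    def put(self, key, value, commit_ts):
        # Keep every version instead of overwriting the previous one.
        versions = self._versions.setdefault(key, [])
        versions.append((commit_ts, value))
        versions.sort(key=lambda v: v[0])

    def snapshot_get(self, key, snapshot_ts):
        """Return the newest value with commit_ts <= snapshot_ts, or None."""
        versions = self._versions.get(key, [])
        timestamps = [ts for ts, _ in versions]
        idx = bisect_right(timestamps, snapshot_ts)
        return versions[idx - 1][1] if idx else None

    def snapshot_scan(self, snapshot_ts):
        """Return every key/value pair visible at snapshot_ts."""
        result = {}
        for key in self._versions:
            value = self.snapshot_get(key, snapshot_ts)
            if value is not None:
                result[key] = value
        return result


store = MVCCStore()
store.put("row1", "a", commit_ts=850)   # written before the backup starts
store.put("row2", "b", commit_ts=930)   # inserted while the backup is running
store.put("row1", "a2", commit_ts=945)  # updated while the backup is running

# The "backup" reads at the snapshot taken when it started (ts=900).
backup = store.snapshot_scan(snapshot_ts=900)
print(backup)  # {'row1': 'a'}
```

The writes at 930 and 945 are stored, but a read at snapshot 900 does not see them, which is the behavior described above for a full backup that starts at 9:00.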
Of course, you can also explicitly specify the physical point in time of the snapshot to back up. If the data for that snapshot has already been garbage collected (GC), the br backup command will report an error and exit. The best practice is to estimate how long the backup will take, increase the GC life time (tikv_gc_life_time) before the backup, and then set it back after the backup is completed.
There is no impact. BR backup reads blocks directly and exports consistent data at a specific point in time through the MVCC mechanism.
There is no impact. From the time you start the full backup until the backup is completed, any new data added during this period will not be backed up.
No impact, snapshot read. However, backups may affect your SQL performance. It is not recommended to perform backups and heavy read/write operations simultaneously.
There will be no impact, and the br backup will automatically adjust the GC.
What about log backups? Similarly, assuming the log starts at 9:00 and stops at 10:00, and a large amount of data is inserted into a table at 9:30. Will this affect the final backup result?
When backing up, isn’t there a TSO value? It captures the current time, and all backups are based on the value at this time point. If there are inserts, updates, or deletes during this period, it will automatically use MVCC to find the corresponding original value at that time.
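As a rough illustration of how a TSO relates to wall-clock time, the sketch below assumes the commonly documented layout in which the physical time in milliseconds occupies the high bits and the low 18 bits hold a logical counter. The helper names `compose_tso` and `tso_physical_time` and the example date are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: the low 18 bits of a TSO are a logical counter and the
# remaining high bits are the physical time in milliseconds.
LOGICAL_BITS = 18


def compose_tso(physical_ms, logical=0):
    """Build a TSO-style integer from milliseconds since epoch and a counter."""
    return (physical_ms << LOGICAL_BITS) | logical


def tso_physical_time(tso):
    """Recover the wall-clock time (UTC) encoded in a TSO-style integer."""
    physical_ms = tso >> LOGICAL_BITS
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)


# Example: a backup that starts at 9:00 UTC on an arbitrary day.
start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
backup_ts = compose_tso(int(start.timestamp() * 1000))

print(backup_ts)
print(tso_physical_time(backup_ts))
```

Every read the backup performs is then evaluated against this one value, so writes that commit with a larger timestamp are invisible to it.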
The default tikv_gc_life_time is 10 minutes. From version 4.0.9 onwards, the official documentation states that if there is a BR backup, this parameter will automatically change. This can be understood as the retention time for historical data.
10 minutes means that historical versions of any data changed within the last 10 minutes are retained and can still be queried. Regardless of how many operations you perform during the backup, the result of the backup will be the data at the TSO time point when the backup started.
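The retention rule can be sketched as a simple check. This models only the plain "keep history for gc_life_time" rule and ignores the automatic GC adjustment that replies in this thread say BR performs; the function names and the example times are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def gc_safe_point(now, gc_life_time):
    """Oldest point in time whose historical versions are still retained."""
    return now - gc_life_time


def snapshot_readable(snapshot_time, now, gc_life_time):
    """A snapshot can be read only if it is not older than the GC safe point."""
    return snapshot_time >= gc_safe_point(now, gc_life_time)


backup_start = datetime(2024, 1, 1, 9, 0)

# 5 minutes into the backup with a 10-minute GC life time: still readable.
print(snapshot_readable(backup_start, datetime(2024, 1, 1, 9, 5), timedelta(minutes=10)))

# 30 minutes in with a 10-minute GC life time: the 9:00 snapshot would be gone
# unless GC is held back for the duration of the backup.
print(snapshot_readable(backup_start, datetime(2024, 1, 1, 9, 30), timedelta(minutes=10)))

# 30 minutes in with a 2-hour GC life time: readable again.
print(snapshot_readable(backup_start, datetime(2024, 1, 1, 9, 30), timedelta(hours=2)))
```

This is why a one-hour backup needs the snapshot's history to be kept for at least the duration of the backup, whether by raising the GC life time manually or by BR holding GC back on its own.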
Log backup is essentially incremental backup, continuously backing up KV change logs. So it will never stop unless you manually stop it. Once you start log backup, all KV changes will be backed up to storage.
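For the 9:00 to 10:00 log backup question, a toy model may help. It assumes a restore starts from a base snapshot and replays logged KV changes up to a chosen timestamp; the `restore_to` function, the tuple layout of the change log, and the integer timestamps are hypothetical and not BR's real format.

```python
def restore_to(base_snapshot, change_log, target_ts):
    """Replay logged changes with commit_ts <= target_ts on top of a base snapshot.

    change_log is a list of (commit_ts, key, value); value None means delete.
    """
    state = dict(base_snapshot)
    for commit_ts, key, value in sorted(change_log, key=lambda e: e[0]):
        if commit_ts > target_ts:
            break  # changes after the target time are not applied
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


base = {"row1": "a"}            # snapshot taken at ts=900 ("9:00")
log = [
    (930, "row2", "b"),         # insert at "9:30", captured by the log backup
    (945, "row1", "a2"),        # update at "9:45"
    (950, "row2", None),        # delete at "9:50"
]

print(restore_to(base, log, target_ts=900))   # {'row1': 'a'}
print(restore_to(base, log, target_ts=935))   # {'row1': 'a', 'row2': 'b'}
print(restore_to(base, log, target_ts=1000))  # {'row1': 'a2'}
```

So, unlike a full backup, data inserted at 9:30 is part of what the log backup stores, and whether it appears after a restore depends on the point in time you restore to.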
No, there is an MVCC mechanism.
Because of the MVCC mechanism, the backup reads the data as of the snapshot time, so it is not affected.