Solutions for Cold Data Backup to Other Storage

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 数据冷备到其他存储的方案

| username: 不输土豆

[TiDB Usage Environment] Production Environment
[TiDB Version] V6.5.0
[Reproduction Path] Normal backup of TiDB service
[Encountered Problem: Phenomenon and Impact] Back up TiDB data to disk so that the data stays safe if the TiDB service fails in an extreme case
[Resource Configuration]

The TiDB service currently works normally. However, hardware failures on production machines, rare bit flips in memory or network cards, network problems, or unknown bugs in the code could corrupt TiDB metadata and prevent the TiDB service from restarting. In such an extreme case, the plan is to clear the cluster data and reload the cold backup data into a new TiDB cluster.
Two scenarios need to be considered:

  • Existing data
  • Incremental data
    The cold backup must not affect the existing TiDB service. The incremental backup should run as a real-time task, and the backed-up data must not contain duplicate entries.

Is there any expert who can provide a feasible solution?

| username: TiDBer_jYQINSnf | Original post link

Use BR for full and incremental backups.
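
As a rough illustration of this suggestion, the sketch below only builds the BR command lines as argument lists; it does not run them. It assumes the `--pd`, `--storage`, and `--lastbackupts` flags of `br backup full` as documented for BR in the v6.5 series (check `br backup full --help` on your version). The PD address, bucket names, and timestamp are hypothetical placeholders.

```python
from typing import List, Optional


def br_backup_cmd(pd_addr: str, storage: str,
                  last_backup_ts: Optional[int] = None) -> List[str]:
    """Build a `br backup full` command line as an argument list.

    Without last_backup_ts this describes a full snapshot backup. With it,
    BR is asked to back up only changes made after that timestamp, which is
    how an incremental backup is expressed.
    """
    cmd = ["tiup", "br", "backup", "full", "--pd", pd_addr, "--storage", storage]
    if last_backup_ts is not None:
        cmd += ["--lastbackupts", str(last_backup_ts)]
    return cmd


if __name__ == "__main__":
    # Hypothetical PD address, storage URIs, and previous backup timestamp.
    pd = "10.0.0.1:2379"
    full = br_backup_cmd(pd, "s3://backup-bucket/full-20240101")
    incr = br_backup_cmd(pd, "s3://backup-bucket/incr-20240102",
                         last_backup_ts=446765432101234567)
    print(" ".join(full))
    print(" ".join(incr))
```

The argument lists can be passed to `subprocess.run` from a scheduler once the flags have been verified against the installed BR version.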

| username: xingzhenxiang | Original post link

Learn about BR + S3 storage.

| username: 不输土豆 | Original post link

The storage medium for the cold backup does not need to be considered here. What I wanted to ask is which tools can be used. The reply above mentioned BR, so I’ll look into it.

| username: changpeng75 | Original post link

Hardware failures are covered by multiple replicas, and bit flips are handled by ECC memory. If you are worried about data or metadata corruption, full and incremental backups with BR are sufficient; a cold backup is not strictly necessary. If you must have a cold backup, use CDC to replicate to a backup database, and stop the backup database while taking the cold backup.

| username: lemonade010 | Original post link

Our V6.5 cluster mounts a NAS and uses BR for full and incremental backups to it. The data on the NAS is then collected centrally by the backup appliance.

| username: xiexin | Original post link

Introduction to Data Migration and Verification Tools

| username: dba-kit | Original post link

  • For logical full backup, you can consider using Dumpling, but it has a certain impact on cluster performance, is relatively inefficient, and does not support incremental backups.
  • Both physical full backup and incremental backup can be handled with BR. In particular, the PITR backup feature available in version 6.5 works much like MySQL’s binlog and writes cluster changes incrementally to storage. For detailed documentation, see: TiDB Log Backup and PITR Guide | PingCAP Docs

| username: Kongdom | Original post link

Generally, BR backup is considered.

| username: DBAER | Original post link

Just back up with BR.

| username: redgame | Original post link

We also use BR in our testing, and it works fine.

| username: zhanggame1 | Original post link

BR plus log backup can be used, but the log backup lags by 1 to 2 minutes. In an extreme case where the database itself loses data, a restore will also be missing that last portion of data.

| username: Fly-bird | Original post link

It is recommended to use CDC to synchronize data to the standby database in the production environment. Perform BR and incremental backups on the standby database to storage. In case of an emergency where the primary database encounters issues, you can directly switch to the standby database.
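
A minimal sketch of the replication step in this setup is shown below. It builds a `cdc cli changefeed create` command with the `--server`, `--sink-uri`, and `--changefeed-id` flags as documented for TiCDC in the v6.5 series; confirm them with `cdc cli changefeed create --help`. The server address, standby host, credentials, and changefeed name are hypothetical.

```python
from typing import List
from urllib.parse import quote


def sink_uri(user: str, password: str, host: str, port: int) -> str:
    """Build a MySQL-protocol sink URI pointing at the standby cluster."""
    # Percent-encode credentials so special characters do not break the URI.
    return f"mysql://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/"


def changefeed_create_cmd(server: str, changefeed_id: str, uri: str) -> List[str]:
    """Build the command that creates a changefeed to the standby."""
    return ["cdc", "cli", "changefeed", "create",
            "--server", server,
            "--sink-uri", uri,
            "--changefeed-id", changefeed_id]


if __name__ == "__main__":
    # Hypothetical TiCDC server, standby address, and credentials.
    uri = sink_uri("root", "p@ss", "10.0.1.1", 4000)
    print(" ".join(changefeed_create_cmd("http://10.0.0.2:8300", "to-standby", uri)))
```

BR backups would then be taken against the standby cluster, so the backup load stays off the primary.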

| username: dba远航 | Original post link

You can add real-time log backup on top of BR.

| username: 数据库真NB | Original post link

Real-time data: BR + CDC
Existing (historical) data: full backup to S3

| username: porpoiselxj | Original post link

Take a full BR backup once a week and run BR log (incremental) backup 24/7, keeping more than a week of history. This allows recovery to any point within the week. Note that the most recent 2-3 minutes of BR log backup data may be lost.

Additionally, if you want to minimize data loss or recover faster, and hardware resources permit, you can use TiCDC to maintain a near-real-time backup cluster. TiCDC should be stable enough to use from v7.1.3 onward, which is the recommended minimum, as earlier versions had various bugs.
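
The weekly full backup plus continuous log backup scheme in this post could be sketched as follows. The code only builds command lines; it assumes the `br log start` and `br restore point` subcommands with the `--task-name`, `--full-backup-storage`, and `--restored-ts` flags as documented for PITR in the v6.5 series. The helper `pick_full_backup`, the addresses, and the storage URIs are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def log_start_cmd(pd_addr: str, task_name: str, log_storage: str) -> List[str]:
    """Build the command that starts a continuous log backup task."""
    return ["tiup", "br", "log", "start", "--task-name", task_name,
            "--pd", pd_addr, "--storage", log_storage]


def pick_full_backup(fulls: Dict[datetime, str], target: datetime) -> str:
    """Return the storage URI of the newest full backup taken at or before target."""
    candidates = [t for t in fulls if t <= target]
    if not candidates:
        raise ValueError("no full backup precedes the requested restore time")
    return fulls[max(candidates)]


def restore_point_cmd(pd_addr: str, log_storage: str, full_storage: str,
                      target: datetime) -> List[str]:
    """Build a point-in-time restore command from a full backup plus logs."""
    return ["tiup", "br", "restore", "point", "--pd", pd_addr,
            "--storage", log_storage,
            "--full-backup-storage", full_storage,
            "--restored-ts", target.strftime("%Y-%m-%d %H:%M:%S%z")]


if __name__ == "__main__":
    tz = timezone(timedelta(hours=8))
    # Hypothetical weekly full backups, keyed by the time they were taken.
    fulls = {
        datetime(2024, 1, 7, 2, 0, tzinfo=tz): "s3://backup-bucket/full-20240107",
        datetime(2024, 1, 14, 2, 0, tzinfo=tz): "s3://backup-bucket/full-20240114",
    }
    target = datetime(2024, 1, 10, 18, 0, tzinfo=tz)
    full = pick_full_backup(fulls, target)
    print(" ".join(log_start_cmd("10.0.0.1:2379", "pitr", "s3://backup-bucket/log")))
    print(" ".join(restore_point_cmd("10.0.2.1:2379", "s3://backup-bucket/log", full, target)))
```

For a restore to succeed, the log backup has to cover the whole interval from the chosen full backup up to the target time, which is why the log task is started before the first full backup and left running.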

| username: TiDBer_ivan0927 | Original post link

You can use BR backup + restore or Dumpling + Lightning.

| username: 像风一样的男子 | Original post link

I wrote a new backup plan, not sure if it will be useful to you.

| username: Hacker_PtIIxHC1 | Original post link

You can use BR backup or Dumpling backup. In practice, BR backup is faster and suitable for large data volumes (but external storage needs to support the S3 protocol).

| username: TiDBer_HErMeXDz | Original post link

Try BR + S3.