How to restore TiDB to a specific point in time?

translator_bot · June 23, 2024, 12:02pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb怎么按时间点恢复？

| username: otjoy

The official BR documentation provides examples of restoring a single table, but it doesn’t explain how to use binlog for point-in-time recovery. If my last backup was taken the previous day, wouldn’t a restore lose up to a day’s worth of data? Is there a tool that consumes TiDB’s binlog, or does TiDB not use binlog for recovery at all?

| username: weixiaobing

For incremental recovery from binlog, you can refer to the Reparo User Guide in the PingCAP Documentation Center.

| username: Gin

Full backup and full restore: BR or Dumpling
Incremental backup: Drainer’s file mode output (tidb-binlog)
Incremental restore: Reparo
The timestamp that links the full backup to the incremental backup is recorded in the meta file in the BR or Dumpling backup directory.
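To make the "connecting timestamp" more concrete: a TiDB timestamp (TSO) packs a physical time in milliseconds into its high bits and an 18-bit logical counter into its low bits. The sketch below converts between a TSO and a wall-clock time, which can help when checking that the incremental restore starts where the full backup ended. The function names are hypothetical helpers for illustration, not part of BR, Dumpling, or Reparo.

```python
from datetime import datetime, timedelta, timezone

# TSO layout: physical milliseconds in the high bits,
# an 18-bit logical counter in the low bits.
LOGICAL_BITS = 18
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tso_to_datetime(tso: int) -> datetime:
    """Return the UTC wall-clock time encoded in a TSO (logical part dropped)."""
    physical_ms = tso >> LOGICAL_BITS
    return EPOCH + timedelta(milliseconds=physical_ms)


def datetime_to_tso(dt: datetime, logical: int = 0) -> int:
    """Build a TSO from a timezone-aware datetime and an optional logical counter."""
    physical_ms = (dt - EPOCH) // timedelta(milliseconds=1)
    return (physical_ms << LOGICAL_BITS) | logical


if __name__ == "__main__":
    # Hypothetical point in time taken from a backup's meta file.
    backup_time = datetime(2024, 6, 23, 12, 2, tzinfo=timezone.utc)
    backup_tso = datetime_to_tso(backup_time)
    print(backup_tso, tso_to_datetime(backup_tso).isoformat())
```

The value read from the meta file would then serve as the starting point for the incremental restore, and a later TSO (or time) as the point to stop at.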

| username: otjoy

Is this the only way? Then I would still need to install Dumpling and Drainer. According to the official recommendation, Dumpling requires 3 nodes and Drainer requires one, which is rather resource-intensive!

| username: Gin

Dumpling is a logical backup tool similar to mydumper. Do you mean that Pump requires 3 nodes? Currently, that’s the case. In future versions, once BR implements incremental file backup, the entire PITR functionality will be covered by BR alone.

| username: otjoy

A follow-up question: my goal is to generate binlog the way MySQL does, purely for recovery purposes. But TiDB seems to require both Pump (to collect data) and Drainer (to parse data). Pump and Drainer each record a binlog, and Reparo restores from the data parsed by Drainer, so installing only Pump is not enough. Drainer also requires a downstream dest_type, which I have no need for, so ideally I would not need Drainer either. How can this be resolved?

| username: tidb狂热爱好者

You can extend the GC life time and use the flashback feature, which is covered in an official tutorial. Recovering data this way is simple and doesn’t require any additional tools.

| username: Gin

Currently, incremental file backup relies on Drainer’s file output mode, that is, db-type = "file".
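As a rough illustration of what that setting looks like in a Drainer configuration, the sketch below renders the relevant fragment as text. Only db-type = "file" comes from this thread; the [syncer] and [syncer.to] section names and the dir key are written from memory of the TiDB Binlog configuration docs and should be treated as assumptions to verify against your version. The function name and output directory are hypothetical.

```python
def drainer_file_mode_config(output_dir: str) -> str:
    """Render a Drainer config fragment that writes binlog to local files.

    Assumption: the section and key names below match the TiDB Binlog
    configuration docs; check them against the version you deploy.
    """
    return (
        "[syncer]\n"
        'db-type = "file"\n'          # file output mode, as mentioned in the thread
        "\n"
        "[syncer.to]\n"
        f'dir = "{output_dir}"\n'     # assumed key: directory for the binlog files
    )


if __name__ == "__main__":
    # Hypothetical directory; Reparo would later read its input from here.
    print(drainer_file_mode_config("data.drainer"))
```

With this mode there is no downstream database to configure: the files Drainer writes are the incremental backup, and Reparo replays them during a restore.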

| username: otjoy

I’m worried that the entire cluster might go down and I won’t be able to run any commands.

| username: otjoy

Binlog is the last resort.

| username: tidb狂热爱好者

Your concern is unnecessary. First, a distributed system has no single point of failure. Second, even if a sudden power outage takes down all your machines, there won’t be any issues. Of course, the data center needs to have a UPS installed.

| username: otjoy

In theory, you are right, but I always feel more secure using binlog. What if the data needed for flashback has already been GC’d, or it takes a long time to notice that a table is missing? May I ask whether your production environment uses flashback plus BR (or Dumpling)? Have you not enabled binlog?

| username: otjoy

Our workload is update-heavy. In the early stage after the project launch, I set tidb_gc_life_time to 48 hours; I did not dare to set it any longer.
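To illustrate the trade-off, here is a minimal sketch of the check behind "has it been GC’d yet?": a point in time can only be recovered through flashback or a snapshot read while it is still newer than the GC safe point. The sketch approximates the safe point as now minus tidb_gc_life_time; the real safe point only advances when GC runs, so this is an estimate. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def within_gc_window(target: datetime, now: datetime, gc_life_time: timedelta) -> bool:
    """Return True if `target` is newer than the approximate GC safe point.

    Assumption: safe point ~= now - gc_life_time. The actual safe point
    depends on when GC last ran and may differ from this estimate.
    """
    safe_point = now - gc_life_time
    return safe_point < target <= now


if __name__ == "__main__":
    now = datetime(2024, 6, 23, 12, 0, tzinfo=timezone.utc)
    mistake_at = now - timedelta(hours=30)  # hypothetical time of a bad UPDATE

    for life_time in (timedelta(hours=48), timedelta(minutes=10)):
        ok = within_gc_window(mistake_at, now, life_time)
        print(f"gc_life_time={life_time}: recoverable={ok}")
```

A longer life time widens the recovery window but keeps more old row versions around, which matters for an update-heavy workload.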

| username: tidb狂热爱好者

Setting it to 48 hours will cause many problems. I only dare to set it to 10 minutes.

| username: HACK

It seems that point-in-time recovery is not supported yet.

| username: otjoy

What are the industry-standard backup methods for TiDB? Can anyone provide some guidance? Do we really not need to enable binlog? We should enable it, right?

| username: tidb狂热爱好者

Don’t worry about TiDB crashing, it’s impossible.

| username: tidb狂热爱好者

TiDB can be overwhelmed but it never has issues.
