Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: BR恢复失败,怎么删除所有恢复的数据,重新恢复呢

How to delete all restored data and restore again after BR restore fails?
So I just directly drop all the restored databases and tables, right? And then the local disk space gets cleaned up when GC runs?
Bro, can you help me check this issue? It reports a restore failure. Can I find the one item that failed and restore just that part separately? That way I wouldn’t have to restore everything again; a full restore takes 10 hours, which is too much trouble.
[2023/12/29 03:12:54.730 +08:00] [INFO] [collector.go:188] [“Full restore Failed summary : total restore files: 99378, total success: 99377, total failed: 1”] [“split region”=28m54.978152613s] [“restore ranges”=88703] [Size=2279671975326] [unitName=file] [error=“rpc error: code = Unavailable desc = transport is closing”] [errorVerbose=“rpc error: code = Unavailable desc = transport is closing\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/juju_adaptor.go:15\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).ingestSST\n\tgithub.com/pingcap/br@/pkg/restore/import.go:480\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).Import.func1\n\tgithub.com/pingcap/br@/pkg/restore/import.go:276\ngithub.com/pingcap/br/pkg/utils.WithRetry\n\tgithub.com/pingcap/br@/pkg/utils/retry.go:46\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).Import\n\tgithub.com/pingcap/br@/pkg/restore/import.go:222\ngithub.com/pingcap/br/pkg/restore.(*Client).RestoreFiles.func2\n\tgithub.com/pingcap/br@/pkg/restore/client.go:584\ngithub.com/pingcap/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\tgithub.com/pingcap/br@/pkg/utils/worker.go:63\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357”]
[2023/12/29 03:12:54.730 +08:00] [ERROR] [restore.go:35] [“failed to restore”] [error=“rpc error: code = Unavailable desc = transport is closing”] [errorVerbose=“rpc error: code = Unavailable desc = transport is closing\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/juju_adaptor.go:15\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).ingestSST\n\tgithub.com/pingcap/br@/pkg/restore/import.go:480\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).Import.func1\n\tgithub.com/pingcap/br@/pkg/restore/import.go:276\ngithub.com/pingcap/br/pkg/utils.WithRetry\n\tgithub.com/pingcap/br@/pkg/utils/retry.go:46\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).Import\n\tgithub.com/pingcap/br@/pkg/restore/import.go:222\ngithub.com/pingcap/br/pkg/restore.(*Client).RestoreFiles.func2\n\tgithub.com/pingcap/br@/pkg/restore/client.go:584\ngithub.com/pingcap/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\tgithub.com/pingcap/br@/pkg/utils/worker.go:63\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357”] [stack=“main.runRestoreCommand\n\tgithub.com/pingcap/br@/cmd/br/restore.go:35\nmain.newFullRestoreCommand.func1\n\tgithub.com/pingcap/br@/cmd/br/restore.go:120\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:887\nmain.main\n\tgithub.com/pingcap/br@/cmd/br/main.go:56\nruntime.main\n\truntime/proc.go:203”]
[2023/12/29 03:12:54.730 +08:00] [ERROR] [main.go:58] [“br failed”] [error=“rpc error: code = Unavailable desc = transport is closing”] [errorVerbose=“rpc error: code = Unavailable desc = transport is closing\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ngithub.com/pingcap/errors.Trace\n\tgithub.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/juju_adaptor.go:15\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).ingestSST\n\tgithub.com/pingcap/br@/pkg/restore/import.go:480\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).Import.func1\n\tgithub.com/pingcap/br@/pkg/restore/import.go:276\ngithub.com/pingcap/br/pkg/utils.WithRetry\n\tgithub.com/pingcap/br@/pkg/utils/retry.go:46\ngithub.com/pingcap/br/pkg/restore.(*FileImporter).Import\n\tgithub.com/pingcap/br@/pkg/restore/import.go:222\ngithub.com/pingcap/br/pkg/restore.(*Client).RestoreFiles.func2\n\tgithub.com/pingcap/br@/pkg/restore/client.go:584\ngithub.com/pingcap/br/pkg/utils.(*WorkerPool).ApplyOnErrorGroup.func1\n\tgithub.com/pingcap/br@/pkg/utils/worker.go:63\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.0.0-20201020160332-67f06af15bc9/errgroup/errgroup.go:57\nruntime.goexit\n\truntime/asm_amd64.s:1357”] [stack=“main.main\n\tgithub.com/pingcap/br@/cmd/br/main.go:58\nruntime.main\n\truntime/proc.go:203”]
That’s really unfortunate. It’s hard to tell which one has the issue. How about backing them up separately and restoring them one by one?
Yes, directly drop all the restored database tables. After deletion, you can re-execute BR restore for data recovery.
The local space will be automatically cleaned and released after the system’s GC.
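A minimal sketch of that workflow, assuming the restored databases are `db1`/`db2`, TiDB listens on 10.0.1.1:4000, PD on 10.0.1.1:2379, and the backup sits at a hypothetical `s3://backup/full-20231228` path (all names are placeholders to replace with your own):

```bash
# 1. Drop whatever the failed restore created (hypothetical database names).
mysql -h 10.0.1.1 -P 4000 -u root -e "DROP DATABASE IF EXISTS db1; DROP DATABASE IF EXISTS db2;"

# 2. Optionally check the GC settings: the dropped data's disk space is reclaimed
#    only after the GC safepoint passes and TiKV compacts the deleted ranges.
mysql -h 10.0.1.1 -P 4000 -u root -e \
  "SELECT VARIABLE_NAME, VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME IN ('tikv_gc_life_time','tikv_gc_run_interval');"

# 3. Re-run the full restore against the same backup.
br restore full \
  --pd "10.0.1.1:2379" \
  --storage "s3://backup/full-20231228" \
  --log-file restore-retry.log
```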
If you know which table failed to restore, BR Restore supports restoring data for a specific table. I recall that you just need to add the appropriate filter rules during the restoration process. You can refer to the official documentation for detailed instructions.
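For reference, a hedged sketch of that single-table path; the database/table names, PD address, and storage path below are placeholders, and it is worth confirming the exact flags with `br restore --help` for your BR version:

```bash
# Restore only one table from the same backup (placeholders throughout).
br restore table \
  --db "mydb" \
  --table "mytable" \
  --pd "10.0.1.1:2379" \
  --storage "s3://backup/full-20231228"

# Or apply a table filter to the full restore instead:
br restore full \
  --filter "mydb.mytable" \
  --pd "10.0.1.1:2379" \
  --storage "s3://backup/full-20231228"
```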
First, drop the restored databases, and then you can run the restore again.
Then, if you want to improve the restore speed, you can refer to the official documentation.
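As a rough illustration only (not taken from that documentation, and worth verifying against `br restore full --help` on your version), the usual restore-speed knobs are the concurrency and rate-limit flags:

```bash
# Illustrative values only: raise --concurrency and leave --ratelimit unset
# (no per-TiKV cap) when the cluster has headroom.
br restore full \
  --pd "10.0.1.1:2379" \
  --storage "s3://backup/full-20231228" \
  --concurrency 128
```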
Isn’t BR a full restore? Can’t you just delete the original data and start over?
Which table failed to restore? How can I check this and where can I find relevant information? It only gives a “fail” message, and I don’t know if it’s a table, a region, or a local file. I want to restore just this one, since there are so many, and only one failed.
It reported an error, and the official documentation says it’s because the performance is too low, so the speed should be reduced.
Check the failed information to see if there is any schema-related information.
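One way to dig for that, sketched with hypothetical log paths, grep patterns, and table ID (the exact log messages vary by BR version):

```bash
# Look for the log entry tied to the one failed file (path and patterns are placeholders).
grep -iE 'fail|error|ingest' /path/to/br-restore.log | grep -v 'summary'

# If the failing entry carries a table ID, map it back to a schema/table via TiDB
# (1234 is a placeholder ID).
mysql -h 10.0.1.1 -P 4000 -u root -e \
  "SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.tables WHERE TIDB_TABLE_ID = 1234;"
```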
The information I posted is all there is. I can’t see any related information.
If the hardware performance is not up to standard, you can only try reducing the speed~
Is reducing the speed done with --retelimit? I saw this in the official documentation. I have already set --ratelimit 70 to limit it to 70 MiB/s, but why do I still see the TiKV server’s write speed exceeding 100?
Check the spelling. Also, is TiKV deployed on a dedicated server? The rate limit here applies to each TiKV node, not to the cluster as a whole or to the server it runs on.
The --ratelimit option limits the speed at which each TiKV node executes restore tasks (in MiB/s).
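A quick back-of-the-envelope check of why the observed write rate can exceed the flag value (the node count below is a placeholder): the cap applies per TiKV node, so the aggregate ceiling scales with the number of nodes.

```bash
# Per-node cap vs. aggregate ceiling (placeholder node count).
RATELIMIT_MIB=70
TIKV_NODES=3
echo "$((RATELIMIT_MIB * TIKV_NODES)) MiB/s aggregate ceiling"   # prints: 210 MiB/s aggregate ceiling
```

So with --ratelimit 70 and, say, three TiKV nodes, total write traffic above 100 MiB/s is expected rather than a sign the limit is being ignored.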