Error in TiDB Backup and Restore

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Tidb 备份还原出错

| username: TiDBer_tfNa3eoP

The issue has been resolved. It seems that BR cannot restore this many tables at once. One database has up to 100,000 tables, which causes the DDL jobs shown by ADMIN SHOW DDL JOBS; to queue up during a restore.

In the end, we had to restore the tables logically and then use BR for the restore.

【TiDB Usage Environment】PoC
【TiDB Version】7.5
【Reproduction Path】Backed up several times with errors, then killed the process and restarted
【Encountered Problem: Symptoms and Impact】Running ADMIN SHOW DDL JOBS; shows many jobs in the queueing state.
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】

| username: TiDBer_q2eTrp5h | Original post link

Please provide the cluster deployment status and the current cluster resource usage.

| username: TiDBer_tfNa3eoP | Original post link

The hard drive shows an error because it is mounted on another server, but the actual usage is less than 20%.

| username: Ming | Original post link

What is the backup error message?

| username: zhanggame1 | Original post link

Who executed these DDLs? Also, check if there are any MDL locks.

| username: TiDBer_tfNa3eoP | Original post link

The backup is normal, but it cannot be restored. There are 60 databases, and only 4 of them have issues. Upon execution, it was found that there are DDLs queued for these four databases.

| username: TiDBer_tfNa3eoP | Original post link

The DDLs are executed automatically during the restore.
How can I check for MDL locks?

| username: 小龙虾爱大龙虾 | Original post link

Try restarting all TiDB nodes.

| username: TiDBer_tfNa3eoP | Original post link

Reinitializing doesn’t work either.

| username: tidb菜鸟一只 | Original post link

Was it restored from a BR backup? If so, was the restore done by database or for the entire cluster?
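
For reference, the two scopes correspond to the `br restore full` and `br restore db` subcommands. The sketch below only builds the command line for each scope so the difference is easy to see; it does not run anything. The PD address, storage URL, database name, and the helper name `build_br_restore_command` are illustrative assumptions, not values from this thread.

```python
import shlex


def build_br_restore_command(scope, pd_addr, storage, db=None):
    """Build a BR restore command line for one scope.

    scope: "full" (entire cluster) or "db" (a single database).
    pd_addr, storage, db: placeholders supplied by the caller (hypothetical here).
    """
    if scope not in ("full", "db"):
        raise ValueError("scope must be 'full' or 'db'")
    argv = ["tiup", "br", "restore", scope, "--pd", pd_addr, "--storage", storage]
    if scope == "db":
        if not db:
            raise ValueError("a database name is required for scope 'db'")
        argv += ["--db", db]
    # shlex.join quotes arguments where needed so the line can be pasted into a shell.
    return shlex.join(argv)


# Hypothetical values, for illustration only.
print(build_br_restore_command("full", "127.0.0.1:2379", "local:///tmp/backup"))
print(build_br_restore_command("db", "127.0.0.1:2379", "local:///tmp/backup", db="db_a"))
```

Restoring one database at a time makes it easier to see which databases trigger the DDL queue.
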

| username: zhanggame1 | Original post link

Are you using physical backup or logical backup?

| username: TIDB-Learner | Original post link

BR is a physical backup.

| username: yytest | Original post link

When ADMIN SHOW DDL JOBS; shows many DDL jobs in the queueing state, those jobs have entered the DDL job queue but have not started executing because they are waiting for preceding DDL jobs to complete. This can happen when system resources are insufficient, when there are too many DDL jobs, or when some jobs take too long to execute.

To resolve this issue, you can consider the following steps:

  1. Check System Resources: Ensure that your system has enough resources to handle DDL jobs. If resource usage is high, you may need to reduce other operations or increase resources.
  2. Optimize DDL Jobs: Try to reduce the number of unnecessary DDL jobs or break down large DDL jobs into smaller parts so they can be completed more quickly.
  3. Adjust DDL Job Priority: If possible, try to adjust the priority of DDL jobs so that the most important jobs are executed first.
  4. Monitor DDL Job Execution: Use tools or commands provided by TiDB to monitor the execution of DDL jobs to promptly identify and resolve issues.
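
As a rough illustration of step 4, the sketch below tallies queued DDL jobs per database from rows returned by ADMIN SHOW DDL JOBS;. It assumes the rows have already been fetched with some MySQL-compatible client and converted to dicts keyed by the `DB_NAME` and `STATE` columns; the helper names and the sample rows are made up for the example.

```python
from collections import Counter


def summarize_ddl_jobs(rows):
    """Count DDL jobs per (database, state) pair.

    rows: iterable of dicts keyed by ADMIN SHOW DDL JOBS column names;
    only DB_NAME and STATE are read.
    """
    counts = Counter()
    for row in rows:
        counts[(row["DB_NAME"], row["STATE"])] += 1
    return counts


def queued_by_database(rows):
    """Return {db_name: number of jobs in the 'queueing' state}, largest first."""
    summary = summarize_ddl_jobs(rows)
    queued = {db: n for (db, state), n in summary.items() if state == "queueing"}
    return dict(sorted(queued.items(), key=lambda item: item[1], reverse=True))


# Made-up sample rows, for illustration only.
sample_rows = [
    {"DB_NAME": "db_b", "STATE": "queueing"},
    {"DB_NAME": "db_a", "STATE": "queueing"},
    {"DB_NAME": "db_a", "STATE": "synced"},
    {"DB_NAME": "db_a", "STATE": "queueing"},
]
print(queued_by_database(sample_rows))
```

A database whose queued count keeps growing between checks is a likely candidate for the kind of backlog described above.
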

| username: 小于同学 | Original post link

Restart it.

| username: TiDBer_tfNa3eoP | Original post link

The issue has been resolved. It seems that BR cannot restore this many tables at once. One database has up to 100,000 tables, which causes the DDL jobs shown by ADMIN SHOW DDL JOBS; to queue up during a restore. In the end, we had to restore the tables logically and then use BR for the restore.

| username: tony5413 | Original post link

Could you explain this? I didn’t understand.

| username: TiDBer_tfNa3eoP | Original post link

According to answers from other community members, the cause is a table splitting issue during BR restore, and the related parameter needs to be turned off. I haven’t tried it yet, but it should work.

| username: tony5413 | Original post link

Thank you, I’ll take a look.