Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: Error when restoring data using a logical backup
[Test Environment for TiDB] Testing
[TiDB Version] 6.5.0
[Encountered Issues: Symptoms and Impact]
[error=“Error 9001: PD server timeout”]
[error=“Error 8027: Information schema is out of date: schema failed to update in 1 lease, please make sure TiDB can connect to TiKV”]
I deployed a single-node TiDB (not a cluster) on a 32C/32G server and am restoring with the logical import tool tidb-lightning.
The address used is 127.0.0.1, and the backup size is 300M.
Check whether any node hit an OOM, and also check which backend mode Lightning is using.
This part is the configuration
[lightning]
level = "info"
file = "/tmp/tidb-lightning.log"
max-size = 128 # MB
max-days = 28
max-backups = 14
#check-requirements = false
[tikv-importer]
backend = "tidb"
sorted-kv-dir = "/tmp/sorted-kv-dir"
on-duplicate = "replace"
[mydumper]
data-source-dir = "/data1/open_data"
[tidb]
host = "127.0.0.1"
port = 4000
user = "root"
password = "test123#@"
status-port = 10080
pd-addr = "127.0.0.1:2379"
log-level = "error"
Did you only deploy a single TiDB node without deploying PD and TiKV?
They are deployed; only TiFlash is not deployed. This is the log.
(The log was posted as a screenshot and is not reproduced in this translation.)
Based on the logs, it looks like the node ran out of memory (OOM). You restarted at 8:54 and hit the error while importing with Lightning at 9:59, right?
Yes, I restarted the cluster at 8:54, and then started importing data.
How should I handle the OOM? Should I increase the memory?
First, confirm whether the Lightning import is actually what caused the OOM. Run the import again to reproduce the error, then check whether each node's restart time matches the import time; if it does, the import caused the OOM. In that case, check the actual memory usage, or try reducing the number of Lightning worker threads.
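To check whether the kernel's OOM killer terminated a process, you can scan the kernel log. A minimal sketch is below; the sample log lines are invented for illustration so the command is self-contained, and the process name `tidb-server` is only an example of what you might see. On a real host you would run `dmesg -T | grep -i 'out of memory'` (or `journalctl -k | grep -i oom`) instead of grepping the sample file.

```shell
# Write a hypothetical excerpt of kernel-log output (sample data, not real logs)
cat <<'EOF' > /tmp/dmesg_sample.txt
[Mon Jan  9 09:59:12 2023] Out of memory: Killed process 2317 (tidb-server) total-vm:30511104kB
[Mon Jan  9 09:59:12 2023] oom_reaper: reaped process 2317 (tidb-server)
EOF

# Look for OOM-killer events; a match here means the kernel killed a process
grep -i 'out of memory' /tmp/dmesg_sample.txt
```

If a line like `Out of memory: Killed process ... (tidb-server)` appears around the time the import failed, the OOM explains both the PD timeout and the stale-schema error, since the killed component stopped responding.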
Thanks! I'll try reducing the number of threads. I have 3 files here; restoring 500M of data produced no errors, but when restoring 300G the errors start appearing after a while.
Which parameter should I use to reduce the number of threads?
Did you solve it in the end? For the TiDB backend, concurrency is controlled by region-concurrency.
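A minimal sketch of lowering the worker concurrency in the Lightning config file, assuming the same tidb-lightning.toml shown above (the value 8 is illustrative; region-concurrency defaults to the number of logical CPUs):

```toml
[lightning]
# region-concurrency controls the number of concurrent import workers.
# It defaults to the number of logical CPUs; lowering it reduces memory
# pressure at the cost of import speed. The value below is illustrative.
region-concurrency = 8
```

After changing the value, rerun the import and watch whether memory usage stays within the 32G available on the node.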