Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: Error when restoring data using a logical backup
[Test Environment for TiDB] Testing
[TiDB Version] 6.5.0
[Encountered Issues: Symptoms and Impact]
[error=“Error 9001: PD server timeout”]
[error=“Error 8027: Information schema is out of date: schema failed to update in 1 lease, please make sure TiDB can connect to TiKV”]
I deployed a single-node TiDB (not a cluster) on a 32C/32G server and am restoring with the logical import tool tidb-lightning.
The address used is 127.0.0.1, and the backup size is 300M.
Check whether any node hit an OOM, and also check which backend mode Lightning is using.
This part is the configuration
[lightning]
level = "info"
file = "/tmp/tidb-lightning.log"
max-size = 128 # MB
max-days = 28
max-backups = 14
#check-requirements = false
[tikv-importer]
backend = "tidb"
sorted-kv-dir = "/tmp/sorted-kv-dir"
on-duplicate = "replace"
[mydumper]
data-source-dir = "/data1/open_data"
[tidb]
host = "127.0.0.1"
port = 4000
user = "root"
password = "test123#@"
status-port = 10080
pd-addr = "127.0.0.1:2379"
log-level = "error"
Did you only deploy a single TiDB node without deploying PD and TiKV?
They are deployed; only TiFlash is not deployed. This is the log.
(The log was posted as a screenshot and is not reproduced in this translation.)
Based on the logs, it looks like the node ran out of memory (OOM). You restarted at 8:54 and hit the error while importing with Lightning at 9:59, right?
Yes, I restarted the cluster at 8:54, and then started importing data.
How should I handle the OOM? Should I increase the memory?
First, confirm whether the Lightning import is actually what caused the OOM. Run the import again to reproduce the error, then check whether each node's restart time matches the import time; if it does, the import caused the OOM. In that case, check the actual memory usage, or try reducing the number of Lightning worker threads.
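To check whether the kernel's OOM killer terminated a process, you can scan the kernel log. A minimal sketch is below; the sample log lines are invented for illustration so the command is self-contained, and the process name `tidb-server` is only an example of what you might see. On a real host you would run `dmesg -T | grep -i 'out of memory'` (or `journalctl -k | grep -i oom`) instead of grepping the sample file.

```shell
# Write a hypothetical excerpt of kernel-log output (sample data, not real logs)
cat <<'EOF' > /tmp/dmesg_sample.txt
[Mon Jan  9 09:59:12 2023] Out of memory: Killed process 2317 (tidb-server) total-vm:30511104kB
[Mon Jan  9 09:59:12 2023] oom_reaper: reaped process 2317 (tidb-server)
EOF

# Look for OOM-killer events; a match here means the kernel killed a process
grep -i 'out of memory' /tmp/dmesg_sample.txt
```

If a line like `Out of memory: Killed process ... (tidb-server)` appears around the time the import failed, the OOM explains both the PD timeout and the stale-schema error, since the killed component stopped responding.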
Thanks! I'll try reducing the number of threads. I have 3 files here; restoring 500M of data produced no errors, but when restoring 300G the errors start appearing after a while.
Which parameter should I use to reduce the number of threads?
Did you solve it in the end? For the TiDB backend, concurrency is controlled by region-concurrency.
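A minimal sketch of lowering the worker concurrency in the Lightning config file, assuming the same tidb-lightning.toml shown above (the value 8 is illustrative; region-concurrency defaults to the number of logical CPUs):

```toml
[lightning]
# region-concurrency controls the number of concurrent import workers.
# It defaults to the number of logical CPUs; lowering it reduces memory
# pressure at the cost of import speed. The value below is illustrative.
region-concurrency = 8
```

After changing the value, rerun the import and watch whether memory usage stays within the 32G available on the node.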