[SOP Series 36] Database Backup and Recovery (TiDB)

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 【SOP 系列 36】数据库备份与恢复(TiDB)

| username: Billmay表妹

Thanks to @会飞的土拨鼠 for sharing

Backup Tools

1. Tool Download Address

# In the download link, {version} is the version number of Dumpling. For example, the download link for version v5.2.2 is
https://download.pingcap.org/tidb-toolkit-v5.2.2-linux-amd64.tar.gz

2. Tool Content Details

[root@ncayu8847 tidb-toolkit-v5.2.2-linux-amd64]# cd bin/
[root@ncayu8847 bin]# ll
Total 535180
-rwxr-xr-x 1 root root 115875760 Oct 29 11:26 br
-rwxr-xr-x 1 root root  79583835 Oct 29 11:26 dumpling
-rwxr-xr-x 1 root root  12782832 Oct 29 11:26 mydumper
-rwxr-xr-x 1 root root  17258378 Oct 29 11:26 pd-tso-bench
-rwxr-xr-x 1 root root  71600963 Oct 29 11:26 sync_diff_inspector
-rwxr-xr-x 1 root root 129306304 Oct 29 11:26 tidb-lightning
-rwxr-xr-x 1 root root 121594400 Oct 29 11:26 tidb-lightning-ctl

It is officially recommended to use the br tool for data backup and recovery. Due to environmental limitations, we are currently using the dumpling tool for backup and the tidb-lightning tool for data recovery.

Database Backup with Dumpling

Backup Command

# Backup command:
./dumpling -P 4000 -h 192.168.70.133 -u root -p ncayu123456 --filetype sql -B ncayu_test -t 1 --loglevel warn -F 64MiB -o /data1/tidb_tool/backup/ncayu_test1130
# Parameter Explanation:
-P  Port
-h  Database IP address
-u  Database user
-p  Database password
--filetype sql  Save backup file as SQL type
-B  Specify the name of the database to be backed up
-t  Specify the number of threads for export
--loglevel Output log level
Log levels {debug, info, warn, error, dpanic, panic, fatal} Default is "info"
-F 64MiB  Specify the size of a single SQL backup file
-o /data1/tidb_tool/backup/ncayu_test1130  Specify the path to store the backup files

Database Recovery with tidb-lightning

1. Start tidb-lightning

It is recommended to run tidb-lightning. If you start the program directly in the command line with nohup, it may exit due to the SIGHUP signal. It is recommended to put nohup in a script.

Method 1: Start the program first, then add the import configuration on the page
mkdir /tmp/tidb-lightning

#!/bin/bash
nohup ./tidb-lightning --server-mode --status-addr :8289 > /tmp/tidb-lightning/lightning.log 2>&1 &

Method 2: Modify the configuration file directly and start
#!/bin/bash
nohup ./tidb-lightning -config tidb-lightning.toml > nohup.out &

2. Content of tidb-lightning.toml File

[lightning]

# Log
level = "info"
file = "/data1/tidb_tool/backup/push_log.log"

#[checkpoint]
#enable = true
# Name of the database to store checkpoints.
#schema = "tidb_lightning_checkpoint"
# Method to store checkpoints.
#  - file: Store in the local file system.
#  - mysql: Store in a MySQL-compatible database server.
#driver = "file"

[tikv-importer]
# Set backend to local mode
backend = "local"
# Set the local temporary storage path
sorted-kv-dir = "/tmp/sorted-kv-dir"

[mydumper]
# Mydumper source data directory.
data-source-dir = "/data1/tidb_tool/backup/ncayu_test1201"

[tidb]
# Information of the target cluster. The listening address of tidb-server, fill in one.
# Note here, use the TiDB IP address from the Cluster Info in the TiDB management console
host = "192.168.70.133"
port = 4000
user = "root"
password = "ncayu123456"
# Table schema information is obtained from the "status port" of TiDB.
status-port = 10080
# Address of pd-server, fill in one
# Note here, use the PD information from the Cluster Info in the TiDB management console
pd-addr = "192.168.70.133:2379"

In the tidb-lightning.toml file, for different TiDB databases, you need to modify the corresponding database information and the data content path.

3. Adjust the GC Time of the TiDB Cluster

If the data volume is large, it is best to extend the GC (garbage collection) time before importing.

# Adjust the GC time of the TiDB cluster
SELECT * FROM mysql.tidb WHERE VARIABLE_NAME = 'tikv_gc_life_time';
update mysql.tidb set VARIABLE_VALUE = '10h' where VARIABLE_NAME = 'tikv_gc_life_time';
# Change back to the original value after the import is complete
update mysql.tidb set VARIABLE_VALUE = '10m' where VARIABLE_NAME = 'tikv_gc_life_time';

4. Note the Abnormal Exit of Lightning

# If tidb-lightning exits abnormally, the cluster may still be in "import mode", which is not suitable for production environments. You can execute the following command to check the current mode:
tidb-lightning-ctl --fetch-mode
# You can execute the following command to force switch back to "normal mode":
tidb-lightning-ctl --switch-mode=normal

5. Import Completion

After the import is complete, TiDB Lightning will automatically exit. If the import is successful, the last line of the log will show

tidb lightning exit.

Part of the Logs During Data Import

1. nohup.out File

[root@ncayu618 bin]# tail -f nohup.out 
Verbose debug logs will be written to /data1/tidb_tool/backup/push_log.log

+---+------------------------------------------------------------------------------------------------+-------------+--------+
| # | CHECK ITEM                                                                                     | TYPE        | PASSED |
+---+------------------------------------------------------------------------------------------------+-------------+--------+
| 1 | Source csv files size is proper                                                                | performance | true   |
+---+------------------------------------------------------------------------------------------------+-------------+--------+
| 2 | checkpoints are valid                                                                          | critical    | true   |
+---+------------------------------------------------------------------------------------------------+-------------+--------+
| 3 | table schemas are valid                                                                        | critical    | true   |
+---+------------------------------------------------------------------------------------------------+-------------+--------+
| 4 | Cluster is available                                                                           | critical    | true   |
+---+------------------------------------------------------------------------------------------------+-------------+--------+
| 5 | Lightning has the correct storage permission                                                   | critical    | true   |
+---+------------------------------------------------------------------------------------------------+-------------+--------+
| 6 | local disk resources are rich, estimate sorted data size 7.689GiB, local available is 45.29GiB | critical    | true   |
+---+------------------------------------------------------------------------------------------------+-------------+--------+
| 7 | Cluster capacity is rich, capacity is 2.402TiB, we need 23.07GiB                               | critical    | true   |
+---+------------------------------------------------------------------------------------------------+-------------+--------+

2. push_log.log Log File

[2021/12/01 16:34:40.805 +08:00] [INFO] [restore.go:1221] ["add back PD leader&region schedulers"]
[2021/12/01 16:34:40.805 +08:00] [INFO] [restore.go:1224] ["cleanup task metas"]
[2021/12/01 16:34:42.973 +08:00] [INFO] [restore.go:1173] ["cancel periodic actions"] [do=true]
[2021/12/01 16:34:42.974 +08:00] [INFO] [restore.go:1038] ["switch to normal mode"]
[2021/12/01 16:34:43.310 +08:00] [WARN] [meta_manager.go:843] ["there are unfinished table in table meta table, cleanup skipped."]
[2021/12/01 16:34:43.310 +08:00] [INFO] [restore.go:1488] ["skip full compaction"]
[2021/12/01 16:34:43.318 +08:00] [INFO] [restore.go:1660] ["clean checkpoints start"] [keepAfterSuccess=false] [taskID=1638344968777845215]
[2021/12/01 16:34:43.319 +08:00] [INFO] [restore.go:1667] ["clean checkpoints completed"] [keepAfterSuccess=false] [taskID=1638344968777845215] [takeTime=105.15µs] []
[2021/12/01 16:34:43.319 +08:00] [INFO] [restore.go:431] ["the whole procedure completed"] [takeTime=45m14.509755009s] []
[2021/12/01 16:34:43.320 +08:00] [INFO] [main.go:93] ["tidb lightning exit"]
[2021/12/01 16:34:43.320 +08:00] [INFO] [checksum.go:468] ["service safe point keeper exited"]

| username: 会飞的土拨鼠 | Original post link

Thank you, hehe.

| username: Billmay表妹 | Original post link

Thank you for your contribution.

| username: Hacker_xUwtuKxa | Original post link

How did you evaluate the storage space size when using Lightning for local import?

| username: TiDBer_小阿飞 | Original post link

Why, after completing the configuration and executing the command, there are no errors reported, but the data import does not start?

Execution statement: tiup tidb-lightning -config lightning.toml --server-mode --status-addr :8290
After submitting the command in the foreground and the toml file, there is no response, and the background logs do not report any errors.

Configuration is as follows:

[lightning]
level = "info"
file = "tidb-lightning.log"
server-mode = true
status-addr = ':8290'
check-requirements = true

[mydumper]
data-source-dir = "/data"

[tikv-importer]
backend = "local"
duplicate-resolution = 'remove'
sorted-kv-dir = "./some-dir"

[tidb]
host = "192.168.80.201"
port = 4000
user = "root"
status-port = 10080
pd-addr = "192.168.80.201:2379"
log-level = "error"

Logs are as follows:

[root@localhost tidb-community-toolkit-v6.5.3-linux-amd64]# cat tidb-lightning.log 
[2023/09/06 02:48:02.725 -07:00] [INFO] [lightning.go:266] ["starting HTTP server"] [address="[::]:8290"]
[2023/09/06 02:48:02.725 -07:00] [INFO] [lightning.go:283] ["Lightning server is running, post to /tasks to start an import task"] [address="[::]:8290"]
[2023/09/06 02:48:23.750 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/tasks]
[2023/09/06 02:48:23.750 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/tasks] [status=200] [body="{\"current\":null,\"queue\":[]}\n"] [takeTime=161.62µs] []
[2023/09/06 02:48:23.751 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/progress/task]
[2023/09/06 02:48:23.752 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/pause]
[2023/09/06 02:48:23.752 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/pause] [status=200] [body="{\"paused\":false}"] [takeTime=11.887µs] []
[2023/09/06 02:48:23.753 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/progress/task] [status=200] [] [takeTime=1.432538ms] []
[2023/09/06 02:49:30.381 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=POST] [url=/tasks]
[2023/09/06 02:49:30.381 -07:00] [INFO] [lightning.go:757] ["received task config"] [content="[lightning]\n# 日志\nlevel = \"info\"\nfile = \"tidb-lightning.log\"\nmax-size = 128 # MB\nmax-days = 28\nmax-backups = 14\nserver-mode = true\nstatus-addr = ':8290'\n\n# 启动之前检查集群是否满足最低需求。\ncheck-requirements = true\n\n[mydumper]\n# 本地源数据目录或外部存储 URL\ndata-source-dir = \"/data\"\n\n[tikv-importer]\n# 导入模式配置,设为 local 即使用 Physical Import Mode\nbackend = \"local\"\n\n# 冲突数据处理方式\nduplicate-resolution = 'remove'\n\n# 本地进行 KV 排序的路径。\nsorted-kv-dir = \"./some-dir\"\n\n# 限制 TiDB Lightning 向每个 TiKV 节点写入的带宽大小,默认为 0,表示不限制。\n# store-write-bwlimit = \"128MiB\"\n\n[tidb]\n# 目标集群的信息。tidb-server 的地址,填一个即可。\nhost = \"192.168.80.201\"\nport = 4000\nuser = \"root\"\n# 设置连接 TiDB 的密码,可为明文或 Base64 编码。\npassword = \"\"\n# 必须配置。表结构信息从 TiDB 的“status-port”获取。\nstatus-port = 10080\n# 必须配置。pd-server 的地址,填一个即可。\npd-addr = \"192.168.80.201:2379\"\n# tidb-lightning 引用了 TiDB 库,并生成产生一些日志。\n# 设置 TiDB 库的日志等级。\nlog-level = \"error\"\n\n\n"]
[2023/09/06 02:49:30.383 -07:00] [WARN] [config.go:1070] ["currently only per-task configuration can be applied, global configuration changes can only be made on startup"] ["global config changes"="[lightning.level,lightning.file,lightning.max-size,lightning.max-days,lightning.max-backups,lightning.server-mode,lightning.status-addr,tidb.log-level]"]
[2023/09/06 02:49:30.385 -07:00] [WARN] [config.go:1070] ["currently only per-task configuration can be applied, global configuration changes can only be made on startup"] ["global config changes"="[lightning.level,lightning.file,lightning.max-size,lightning.max-days,lightning.max-backups,lightning.server-mode,lightning.status-addr,tidb.log-level]"]
[2023/09/06 02:49:30.385 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=POST] [url=/tasks] [status=200] [body="{\"id\":1693993770385294614}\n"] [takeTime=4.099549ms] []
[2023/09/06 02:49:30.386 -07:00] [INFO] [info.go:49] ["Welcome to TiDB-Lightning"] [release-version=v7.2.0] [git-hash=9fd5f4a8e4f273a60fbe7d3848f85a1be8f0600b] [git-branch=heads/refs/tags/v7.2.0] [go-version=go1.20.5] [utc-build-time="2023-06-27 14:58:41"] [race-enabled=false]
[2023/09/06 02:49:30.386 -07:00] [INFO] [lightning.go:426] [cfg] [cfg="{\"id\":1693993770385294614,\"lightning\":{\"table-concurrency\":6,\"index-concurrency\":2,\"region-concurrency\":8,\"io-concurrency\":5,\"check-requirements\":true,\"meta-schema-name\":\"lightning_metadata\",\"max-error\":{\"type\":0,\"conflict\":9223372036854775807},\"max-error-records\":100,\"task-info-schema-name\":\"lightning_task_info\"},\"tidb\":{\"host\":\"192.168.80.201\",\"port\":4000,\"user\":\"root\",\"status-port\":10080,\"pd-addr\":\"192.168.80.201:2379\",\"sql-mode\":\"ONLY_FULL_GROUP_BY,NO_AUTO_CREATE_USER\",\"tls\":\"false\",\"security\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"redact-info-log\":false},\"max-allowed-packet\":67108864,\"distsql-scan-concurrency\":15,\"build-stats-concurrency\":20,\"index-serial-scan-concurrency\":20,\"checksum-table-concurrency\":2,\"vars\":null},\"checkpoint\":{\"schema\":\"tidb_lightning_checkpoint\",\"driver\":\"file\",\"enable\":true,\"keep-after-success\":\"remove\"},\"mydumper\":{\"read-block-size\":65536,\"batch-size\":0,\"batch-import-ratio\":0,\"source-id\":\"\",\"data-source-dir\":\"file:///data\",\"character-set\":\"auto\",\"csv\":{\"separator\":\",\",\"delimiter\":\"\\\"\",\"terminator\":\"\",\"null\":[\"\\\\N\"],\"header\":true,\"header-schema-match\":true,\"trim-last-separator\":false,\"not-null\":false,\"backslash-escape\":true,\"escaped-by\":\"\\\\\"},\"max-region-size\":268435456,\"filter\":[\"*.*\",\"!mysql.*\",\"!sys.*\",\"!INFORMATION_SCHEMA.*\",\"!PERFORMANCE_SCHEMA.*\",\"!METRICS_SCHEMA.*\",\"!INSPECTION_SCHEMA.*\"],\"files\":null,\"no-schema\":false,\"case-sensitive\":false,\"strict-format\":false,\"default-file-rules\":true,\"ignore-data-columns\":null,\"data-character-set\":\"binary\",\"data-invalid-char-replace\":\"�\"},\"tikv-importer\":{\"addr\":\"\",\"backend\":\"local\",\"on-duplicate\":\"\",\"max-kv-pairs\":4096,\"send-kv-pairs\":32768,\"send-kv-size\":16384,\"compress-kv-pairs\":\"\",\"region-split-size\":0,\"region-split-keys\":0,\"region-split-batch-size\":4096,\"region-split-concurrency\":8,\"region-check-backoff-limit\":1800,\"sorted-kv-dir\":\"./some-dir\",\"disk-quota\":9223372036854775807,\"range-concurrency\":16,\"duplicate-resolution\":\"remove\",\"incremental-import\":false,\"keyspace-name\":\"\",\"add-index-by-sql\":false,\"engine-mem-cache-size\":536870912,\"local-writer-mem-cache-size\":134217728,\"store-write-bwlimit\":0,\"pause-pd-scheduler-scope\":\"table\"},\"post-restore\":{\"checksum\":\"required\",\"analyze\":\"optional\",\"level-1-compact\":false,\"post-process-at-last\":true,\"compact\":false,\"checksum-via-sql\":true},\"cron\":{\"switch-mode\":\"5m0s\",\"log-progress\":\"5m0s\",\"check-disk-quota\":\"1m0s\"},\"routes\":null,\"security\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"redact-info-log\":false},\"black-white-list\":{\"do-tables\":null,\"do-dbs\":null,\"ignore-tables\":null,\"ignore-dbs\":null}}"]
[2023/09/06 02:49:30.388 -07:00] [INFO] [lightning.go:526] ["load data source start"]
[2023/09/06 02:49:30.390 -07:00] [INFO] [loader.go:459] ["[loader] file is filtered by file router"] [path=TEST_TIDB.sql]
[2023/09/06 02:49:30.391 -07:00] [INFO] [lightning.go:529] ["load data source completed"] [takeTime=2.196233ms] []
[2023/09/06 02:49:30.448 -07:00] [INFO] [lightning.go:567] ["acquired keyspace name"] [keyspaceName=]
[2023/09/06 02:49:30.448 -07:00] [INFO] [checkpoints.go:1115] ["open checkpoint file failed, going to create a new one"] [path=/tmp/tidb_lightning_checkpoint.pb] []
[2023/09/06 02:49:30.550 -07:00] [INFO] [local.go:680] ["multi ingest support"]
[2023/09/06 02:49:30.551 -07:00] [INFO] [import.go:483] ["the whole procedure start"]
[2023/09/06 02:49:30.560 -07:00] [INFO] [import.go:1945] [new_collation_enabled] [enabled=true]
[2023/09/06 02:49:30.561 -07:00] [INFO] [import.go:789] ["restore all schema start"]
[2023/09/06 02:49:30.561 -07:00] [INFO] [import.go:808] ["restore all schema completed"] [takeTime=258.485µs] []
[2023/09/06 02:49:30.617 -07:00] [INFO] [version.go:429] ["detect server version"] [type=TiDB] [version=6.5.3]
[2023/09/06 02:49:30.621 -07:00] [INFO] [version.go:429] ["detect server version"] [type=TiDB] [version=6.5.3]
[2023/09/06 02:49:30.686 -07:00] [INFO] [import.go:1466] ["restore all tables data start"]
[2023/09/06 02:49:30.721 -07:00] [INFO] [import.go:1351] ["cancel periodic actions"] [do=true]
[2023/09/06 02:49:30.721 -07:00] [INFO] [tikv_mode.go:59] ["switch tikv mode"] [mode=Normal]
[2023/09/06 02:49:30.721 -07:00] [INFO] [tikv_mode.go:59] ["switch tikv mode"] [mode=Import]
[2023/09/06 02:49:30.830 -07:00] [INFO] [import.go:1580] ["restore all tables data completed"] [takeTime=144.380605ms] []
[2023/09/06 02:49:30.830 -07:00] [INFO] [import.go:1583] ["cleanup task metas"]
[2023/09/06 02:49:30.845 -07:00] [INFO] [import.go:1817] ["skip full compaction"]
[2023/09/06 02:49:30.846 -07:00] [INFO] [import.go:1968] ["clean checkpoints start"] [keepAfterSuccess=remove] [taskID=1693993770385294614]
[2023/09/06 02:49:30.846 -07:00] [INFO] [import.go:1976] ["clean checkpoints completed"] [keepAfterSuccess=remove] [taskID=1693993770385294614] [takeTime=211.935µs] []
[2023/09/06 02:49:30.846 -07:00] [INFO] [import.go:511] ["the whole procedure completed"] [takeTime=295.435294ms] []
[2023/09/06 02:49:30.854 -07:00] [WARN] [import.go:1203] ["stopping periodic actions"] [error="context canceled"]
[2023/09/06 02:49:30.893 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/tasks]
[2023/09/06 02:49:30.894 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/tasks] [status=200] [body="{\"current\":null,\"queue\":[]}\n"] [takeTime=72.234µs] []
[2023/09/06 02:49:30.895 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/pause]
[2023/09/06 02:49:30.895 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/pause] [status=200] [body="{\"paused\":false}"] [takeTime=19.355µs] []
[2023/09/06 02:49:30.895 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/progress/task]
[2023/09/06 02:49:30.898 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/progress/task] [status=200] [] [takeTime=2.382474ms] []
[2023/09/06 02:50:28.552 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/tasks]
[2023/09/06 02:50:28.552 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/tasks] [status=200] [body="{\"current\":null,\"queue\":[]}\n"] [takeTime=60.121µs] []
[2023/09/06 02:50:28.552 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/progress/task]
[2023/09/06 02:50:28.552 -07:00] [INFO] [lightning.go:200] ["process http request start"] [method=GET] [url=/pause]
[2023/09/06 02:50:28.552 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/pause] [status=200] [body="{\"paused\":false}"] [takeTime=20.062µs] []
[2023/09/06 02:50:28.554 -07:00] [INFO] [lightning.go:209] ["process http request completed"] [method=GET] [url=/progress/task] [status=200] [] [takeTime=1.768129ms] []
[2023/09/06
| username: TiDBer_rvITcue9 | Original post link

Thank you for sharing.

| username: 这里介绍不了我 | Original post link

Thank you for the summary.

| username: TiDBer_5cwU0ltE | Original post link

This is a good source for sharing. The operation commands are clear, and the results are explicit.

| username: redgame | Original post link

:clap::clap::clap:

| username: FutureDB | Original post link

Not bad, the operation steps are quite clear. It would be even more complete if BR was included.

| username: TiDBer_aaO4sU46 | Original post link

Thank you for sharing.

| username: DBAER | Original post link

Awesome. Not bad.

| username: 我是吉米哥 | Original post link

Thank you for sharing.