When using dumpling to export the entire database to an SQL file, the --output-filename-template parameter was mistakenly used, causing lighting to fail to recognize and load the exported sql.gz data

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: dumpling整个database为sql文件时,误用了–output-filename-template参数,导致lighting时无法识别加载导出的sql.gz数据

| username: TiDBer_g9JOVuLf

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.4.0
[Reproduction Path] Operations performed that led to the issue
Single machine deployment with 1 TiDB server, 3 TiKV, and 1 PD instance. The TiKV data instance reached 100% disk usage and needed expansion. Due to disk planning and formatting issues, cluster reinstallation was required. First, I used tiup dumpling to export the existing data to the local machine. Later, I found that I mistakenly used the output-filename-template parameter when referring to the export command online. The command run was as follows (IP and other information hidden):

tiup dumpling -u root -P 4000 -h 10.xx.xx.xxx -o /opt/ccc/data/export/ -r 200000 -t 16 --filetype sql --compress gzip --database db1,test -F 256MiB --output-filename-template 'tidb-test.{{.Index}}'

After exporting, I cleaned and uninstalled the TiDB cluster data, then adjusted the topology file and redeployed the cluster. When using the tiup lighting tool to import data, no errors were reported, but no data was imported. The command was as follows:

nohup tiup tidb-lightning -no-schema -config tidb-lightning.toml > nohup.out &

Execution result log:

[2023/03/24 12:53:34.938 +08:00] [WARN] [config.go:827] ["currently only per-task configuration can be applied, global configuration changes can only be made on startup"] ["global config changes"="[lightning.level,lightning.file]"]
[2023/03/24 12:53:34.939 +08:00] [INFO] [lightning.go:382] [cfg] [cfg="{\"id\":1679633614939008092,\"lightning\":{\"table-concurrency\":6,\"index-concurrency\":2,\"region-concurrency\":88,\"io-concurrency\":5,\"check-requirements\":true,\"meta-schema-name\":\"lightning_metadata\",\"max-error\":{\"type\":0},\"task-info-schema-name\":\"lightning_task_info\"},\"tidb\":{\"host\":\"10.xx.xx.xx\",\"port\":4000,\"user\":\"root\",\"status-port\":10080,\"pd-addr\":\"10.xx.xx.xx:2379\",\"sql-mode\":\"ONLY_FULL_GROUP_BY,NO_AUTO_CREATE_USER\",\"tls\":\"false\",\"security\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"redact-info-log\":false},\"max-allowed-packet\":67108864,\"distsql-scan-concurrency\":15,\"build-stats-concurrency\":20,\"index-serial-scan-concurrency\":20,\"checksum-table-concurrency\":2,\"vars\":null},\"checkpoint\":{\"schema\":\"tidb_lightning_checkpoint\",\"driver\":\"file\",\"enable\":true,\"keep-after-success\":\"remove\"},\"mydumper\":{\"read-block-size\":65536,\"batch-size\":0,\"batch-import-ratio\":0,\"source-id\":\"\",\"data-source-dir\":\"file:///opt/ccc/data1/export\",\"character-set\":\"auto\",\"csv\":{\"separator\":\",\",\"delimiter\":\"\\\"\",\"terminator\":\"\",\"null\":\"\\\\N\",\"header\":true,\"trim-last-separator\":false,\"not-null\":false,\"backslash-escape\":true},\"max-region-size\":268435456,\"filter\":[\"!mysql.*\",\"!sys.*\",\"!INFORMATION_SCHEMA.*\",\"!PERFORMANCE_SCHEMA.*\",\"!METRICS_SCHEMA.*\",\"!INSPECTION_SCHEMA.*\"],\"files\":null,\"no-schema\":true,\"case-sensitive\":false,\"strict-format\":false,\"default-file-rules\":true,\"ignore-data-columns\":null,\"data-character-set\":\"binary\",\"data-invalid-char-replace\":\"�\"},\"tikv-importer\":{\"addr\":\"\",\"backend\":\"local\",\"on-duplicate\":\"replace\",\"max-kv-pairs\":4096,\"send-kv-pairs\":32768,\"region-split-size\":0,\"region-split-keys\":0,\"sorted-kv-dir\":\"/opt/ccc/data/sorted-kv-dir\",\"disk-quota\":9223372036854775807,\"range-concurrency\":16,\"duplicate-resolution\":\"none\",\"incremental-import\":false,\"engine-mem-cache-size\":536870912,\"local-writer-mem-cache-size\":134217728,\"store-write-bwlimit\":0},\"post-restore\":{\"checksum\":\"required\",\"analyze\":\"optional\",\"level-1-compact\":false,\"post-process-at-last\":true,\"compact\":false},\"cron\":{\"switch-mode\":\"5m0s\",\"log-progress\":\"5m0s\",\"check-disk-quota\":\"1m0s\"},\"routes\":null,\"security\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"redact-info-log\":false},\"black-white-list\":{\"do-tables\":null,\"do-dbs\":null,\"ignore-tables\":null,\"ignore-dbs\":null}}"]
[2023/03/24 12:53:34.959 +08:00] [INFO] [lightning.go:483] ["load data source start"]
[2023/03/24 12:53:34.974 +08:00] [INFO] [loader.go:450] ["[loader] file is filtered by file router"] [path=metadata]
[2023/03/24 12:53:35.546 +08:00] [INFO] [lightning.go:486] ["load data source completed"] [takeTime=587.32434ms] []
[2023/03/24 12:53:35.546 +08:00] [INFO] [checkpoints.go:1014] ["open checkpoint file failed, going to create a new one"] [path=/tmp/tidb_lightning_checkpoint.pb] []
[2023/03/24 12:53:35.554 +08:00] [INFO] [local.go:578] ["multi ingest support"]
[2023/03/24 12:53:35.554 +08:00] [INFO] [restore.go:448] ["the whole procedure start"]
[2023/03/24 12:53:35.556 +08:00] [INFO] [restore.go:1995] [new_collation_enabled] [enabled=true]
[2023/03/24 12:53:35.556 +08:00] [INFO] [restore.go:756] ["restore all schema start"]
[2023/03/24 12:53:35.556 +08:00] [INFO] [restore.go:772] ["restore all schema completed"] [takeTime=82.701µs] []
[2023/03/24 12:53:35.562 +08:00] [INFO] [version.go:398] ["detect server version"] [type=TiDB] [version=6.4.0]
[2023/03/24 12:53:35.563 +08:00] [INFO] [version.go:398] ["detect server version"] [type=TiDB] [version=6.4.0]
[2023/03/24 12:53:35.567 +08:00] [INFO] [restore.go:1424] ["restore all tables data start"]
[2023/03/24 12:53:35.569 +08:00] [INFO] [restore.go:1309] ["cancel periodic actions"] [do=true]
[2023/03/24 12:53:35.569 +08:00] [INFO] [restore.go:1869] ["switch import mode"] [mode=Import]
[2023/03/24 12:53:35.569 +08:00] [INFO] [restore.go:1869] ["switch import mode"] [mode=Normal]
[2023/03/24 12:53:35.592 +08:00] [INFO] [restore.go:1532] ["restore all tables data completed"] [takeTime=24.970574ms] []
[2023/03/24 12:53:35.592 +08:00] [INFO] [restore.go:1535] ["cleanup task metas"]
[2023/03/24 12:53:35.592 +08:00] [INFO] [restore.go:1829] ["skip full compaction"]
[2023/03/24 12:53:35.592 +08:00] [INFO] [restore.go:1171] ["everything imported, stopping periodic actions"]
[2023/03/24 12:53:35.592 +08:00] [INFO] [restore.go:2018] ["clean checkpoints start"] [keepAfterSuccess=remove] [taskID=1679633614939008092]
[2023/03/24 12:53:35.592 +08:00] [INFO] [restore.go:2026] ["clean checkpoints completed"] [keepAfterSuccess=remove] [taskID=1679633614939008092] [takeTime=41.754µs] []
[2023/03/24 12:53:35.592 +08:00] [INFO] [restore.go:476] ["the whole procedure completed"] [takeTime=37.750328ms] []
[2023/03/24 12:53:35.593 +08:00] [INFO] [main.go:106] ["tidb lightning exit"] [finished=true]

Contents of tidb-lightning.toml:

[lightning]
# Log
level = "info"
file = "tidb-lightning.log"

[tikv-importer]
# Choose the import mode
backend = "local"
# Set the temporary storage location for sorted key-value pairs, the target path needs to be an empty directory
sorted-kv-dir = "/opt/ccc/data/sorted-kv-dir"

[mydumper]
# Source data directory.
data-source-dir = "/opt/ccc/data1/export/"

# Configure wildcard rules, the default rules will filter all tables under the system databases mysql, sys, INFORMATION_SCHEMA, PERFORMANCE_SCHEMA, METRICS_SCHEMA, INSPECTION_SCHEMA
# If this item is not configured, an exception "schema not found" will occur when importing system tables
#filter = ['*.*', '!mysql.*', '!sys.*', '!INFORMATION_SCHEMA.*', '!PERFORMANCE_SCHEMA.*', '!METRICS_SCHEMA.*', '!INSPECTION_SCHEMA.*']
filter = ['!mysql.*', '!sys.*', '!INFORMATION_SCHEMA.*', '!PERFORMANCE_SCHEMA.*', '!METRICS_SCHEMA.*', '!INSPECTION_SCHEMA.*']

[tidb]
# Information of the target cluster
host = "10.xx.xx.xx"
port = 4000
user = "root"
password = "xxx"
# Table schema information is obtained from TiDB's "status port".
status-port = 10080
# Address of the cluster pd
pd-addr = "10.xx.xx.xx:2379"

[Encountered Problem: Issue Phenomenon and Impact]
The dumpling command mistakenly added a parameter, resulting in abnormal data format during export, which cannot be recognized during the lighting stage. How should this be handled to load the data into the database normally?

tiup dumpling -u root -P 4000 -h 10.xx.xx.xxx -o /opt/ccc/data/export/ -r 200000 -t 16 --filetype sql --compress gzip --database db1,test -F 256MiB --output-filename-template 'tidb-test.{{.Index}}'

[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: 胡杨树旁 | Original post link

After reinstalling, are there tables in the database? I see that --no-schema was added during the lightning import. Can the original exported files still be distinguished? If they can be distinguished, can the file names be manually changed?

| username: 啦啦啦啦啦 | Original post link

Without writing {{.Table}}, the SQL file format definitely won’t be recognized. See if you can manually modify the exported file name. If there’s too much data, consider using a script to do it in batches.
Refer to this:

| username: TiDBer_g9JOVuLf | Original post link

Removed -no-schema, but it still didn’t recognize it.

| username: TiDBer_g9JOVuLf | Original post link

The generated data now is in the format of tidb-test.0000341790000.sql.gz, and there are multiple tables under db1. Just by looking at the file name, I can’t distinguish which tidb-test.000xxx.sql.gz belongs to which table’s data…

| username: 啦啦啦啦啦 | Original post link

Try using a script to extract the first few lines of the SQL file after decompression to get the table name, and then loop to rename the file.

| username: TiDBer_g9JOVuLf | Original post link

Smart, I just analyzed the data files and plan to do the same… :smiley:

| username: 胡杨树旁 | Original post link

In the document, there are some library names and table names. Can we check the first 2 lines and categorize the file?

| username: TiDBer_g9JOVuLf | Original post link

I changed it but it still doesn’t work.

It finishes running very quickly without any specific error messages. The import was not successful.
[2023/03/24 17:23:33.383 +08:00] [WARN] [config.go:827] [“currently only per-task configuration can be applied, global configuration changes can only be made on startup”] [“global config changes”=“[lightning.level,lightning.file]”]
[2023/03/24 17:23:33.383 +08:00] [INFO] [lightning.go:382] [cfg] [cfg=“{"id":1679649813383409438,"lightning":{"table-concurrency":6,"index-concurrency":2,"region-concurrency":88,"io-concurrency":5,"check-requirements":true,"meta-schema-name":"lightning_metadata","max-error":{"type":0},"task-info-schema-name":"lightning_task_info"},"tidb":{"host":"10.xx.xx.xx","port":4000,"user":"root","status-port":10080,"pd-addr":"10.xx.xx.xx:2379","sql-mode":"ONLY_FULL_GROUP_BY,NO_AUTO_CREATE_USER","tls":"false","security":{"ca-path":"","cert-path":"","key-path":"","redact-info-log":false},"max-allowed-packet":67108864,"distsql-scan-concurrency":15,"build-stats-concurrency":20,"index-serial-scan-concurrency":20,"checksum-table-concurrency":2,"vars":null},"checkpoint":{"schema":"tidb_lightning_checkpoint","driver":"file","enable":true,"keep-after-success":"remove"},"mydumper":{"read-block-size":65536,"batch-size":0,"batch-import-ratio":0,"source-id":"","data-source-dir":"file:///opt/ccc/data1/export","character-set":"auto","csv":{"separator":",","delimiter":"\"","terminator":"","null":"\\N","header":true,"trim-last-separator":false,"not-null":false,"backslash-escape":true},"max-region-size":268435456,"filter":["!mysql.","!sys.","!INFORMATION_SCHEMA.","!PERFORMANCE_SCHEMA.","!METRICS_SCHEMA.","!INSPECTION_SCHEMA."],"files":null,"no-schema":false,"case-sensitive":false,"strict-format":false,"default-file-rules":true,"ignore-data-columns":null,"data-character-set":"binary","data-invalid-char-replace":"�"},"tikv-importer":{"addr":"","backend":"local","on-duplicate":"replace","max-kv-pairs":4096,"send-kv-pairs":32768,"region-split-size":0,"region-split-keys":0,"sorted-kv-dir":"/opt/ccc/data/sorted-kv-dir","disk-quota":9223372036854775807,"range-concurrency":16,"duplicate-resolution":"none","incremental-import":false,"engine-mem-cache-size":536870912,"local-writer-mem-cache-size":134217728,"store-write-bwlimit":0},"post-restore":{"checksum":"required","analyze":"optional","level-1-compact":false,"post-process-at-last":true,"compact":false},"cron":{"switch-mode":"5m0s","log-progress":"5m0s","check-disk-quota":"1m0s"},"routes":null,"security":{"ca-path":"","cert-path":"","key-path":"","redact-info-log":false},"black-white-list":{"do-tables":null,"do-dbs":null,"ignore-tables":null,"ignore-dbs":null}}”]
[2023/03/24 17:23:33.406 +08:00] [INFO] [lightning.go:483] [“load data source start”]
[2023/03/24 17:23:33.423 +08:00] [INFO] [loader.go:450] [“[loader] file is filtered by file router”] [path=metadata]
[2023/03/24 17:23:34.252 +08:00] [INFO] [lightning.go:486] [“load data source completed”] [takeTime=846.285355ms]
[2023/03/24 17:23:34.252 +08:00] [INFO] [checkpoints.go:1014] [“open checkpoint file failed, going to create a new one”] [path=/tmp/tidb_lightning_checkpoint.pb]
[2023/03/24 17:23:34.261 +08:00] [INFO] [local.go:578] [“multi ingest support”]
[2023/03/24 17:23:34.261 +08:00] [INFO] [restore.go:448] [“the whole procedure start”]
[2023/03/24 17:23:34.262 +08:00] [INFO] [restore.go:1995] [new_collation_enabled] [enabled=true]
[2023/03/24 17:23:34.262 +08:00] [INFO] [restore.go:756] [“restore all schema start”]
[2023/03/24 17:23:34.262 +08:00] [INFO] [restore.go:772] [“restore all schema completed”] [takeTime=49.229µs]
[2023/03/24 17:23:34.268 +08:00] [INFO] [version.go:398] [“detect server version”] [type=TiDB] [version=6.4.0]
[2023/03/24 17:23:34.269 +08:00] [INFO] [version.go:398] [“detect server version”] [type=TiDB] [version=6.4.0]
[2023/03/24 17:23:34.273 +08:00] [INFO] [restore.go:1424] [“restore all tables data start”]
[2023/03/24 17:23:34.276 +08:00] [INFO] [restore.go:1309] [“cancel periodic actions”] [do=true]
[2023/03/24 17:23:34.276 +08:00] [INFO] [restore.go:1869] [“switch import mode”] [mode=Import]
[2023/03/24 17:23:34.276 +08:00] [INFO] [restore.go:1869] [“switch import mode”] [mode=Normal]
[2023/03/24 17:23:34.295 +08:00] [INFO] [restore.go:1532] [“restore all tables data completed”] [takeTime=21.927336ms]
[2023/03/24 17:23:34.295 +08:00] [INFO] [restore.go:1535] [“cleanup task metas”]
[2023/03/24 17:23:34.295 +08:00] [INFO] [restore.go:1829] [“skip full compaction”]
[2023/03/24 17:23:34.295 +08:00] [INFO] [restore.go:2018] [“clean checkpoints start”] [keepAfterSuccess=remove] [taskID=1679649813383409438]
[2023/03/24 17:23:34.295 +08:00] [INFO] [restore.go:1171] [“everything imported, stopping periodic actions”]
[2023/03/24 17:23:34.295 +08:00] [INFO] [restore.go:2026] [“clean checkpoints completed”] [keepAfterSuccess=remove] [taskID=1679649813383409438] [takeTime=43.489µs]
[2023/03/24 17:23:34.295 +08:00] [INFO] [restore.go:476] [“the whole procedure completed”] [takeTime=33.94224ms]
[2023/03/24 17:23:34.295 +08:00] [INFO] [main.go:106] [“tidb lightning exit”] [finished=true]

| username: TiDBer_g9JOVuLf | Original post link

The original files range from tidb-test.0000000010000.sql.gz to tidb-test.0000341790000.sql.gz.

Among them,
tidb-test.0000000010000.sql.gz → db1.tb1.0000000010000.sql.gz
tidb-test.0000000020000.sql.gz → db1.tb1.0000000020000.sql.gz

tidb-test.0000000030000.sql.gz → db1.tb2.0000000030000.sql.gz

tidb-test.0000024240000.sql.gz → db1.tb2.0000024240000.sql.gz

tidb-test.0000024250000.sql.gz → db1.tb3.0000024250000.sql.gz

tidb-test.0000341790000.sql.gz → db1.tb3.0000341790000.sql.gz

| username: 胡杨树旁 | Original post link

The exported file is in SQL text format. Can I use the local mode when importing? I haven’t tested this before. Generally, I use the TiDB mode to import when exporting in SQL mode. You can try importing a small file first. Do I need to decompress it before importing?

| username: ljluestc | Original post link

# Step 1: Verify that the data dump created using tiup dump is correct and not corrupted.
# Import the data dump into a new TiDB cluster and check if the data is successfully imported.
tiup playground --db 2 --pd 1 --kv 3 --tiflash 0 --monitor false
# In the new TiDB cluster, run the following command to import the data dump:
# tiup restore --host <tidb_host> --port <tidb_port> --user <tidb_user> --password <tidb_password> --status-addr <tidb_status_address> --backup <backup_dir> --schema <schema_name>
# Replace <tidb_host>, <tidb_port>, <tidb_user>, <tidb_password>, <tidb_status_address>, <backup_dir>, and <schema_name> with appropriate values.

# Step 2: Check the topology file to ensure it is set up correctly.
# Verify that the TiKV data directory is set correctly and that data is being written to the correct directory.
# You can check the TiKV logs to see if there are any errors indicating incorrect data writes.

# Step 3: Check the tidb-lightning configuration file to ensure it is set correctly.
# Ensure the data dump path is correct and the import settings are configured properly.
# You can also check the tidb-lightning logs to see if there are any errors indicating incorrect configuration.
# Here is an example of a tidb-lightning configuration file:
# tidb-server-addr = "<tidb_host>:<tidb_port>"
# pd-addr = "<pd_address>"
# tikv-importer.backend = "local"
# tikv-importer.sorted-kv-dir = "<sorted_kv_dir>"
# checkpoint.enable = true
# checkpoint.driver = "file"
# checkpoint.schema = "<checkpoint_schema_name>"
# checkpoint.keep-after-success = "remove"
# data-source-dir = "<data_dump_dir>"
# no-schema = true
# Replace <tidb_host>, <tidb_port>, <pd_address>, <sorted_kv_dir>, <checkpoint_schema_name>, and <data_dump_dir> with appropriate values.

# Step 4: Run the tidb-lightning tool again and check the logs for any error messages.
nohup tiup tidb-lightning -no-schema -config tidb-lightning.toml > nohup.out &
# Check the nohup.out file for any error messages.