Importing Data via tidb-lightning: Unable to Match Multiple Source Data Files

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 通过tidb-lightning导入数据,无法匹配多个源数据文件

| username: linuxmysql

[TiDB Environment] Test
[TiDB Version] v6.1.0

The data files exported by dumpling are as follows:
nsy_scm.stock.0000000000000.sql
nsy_scm.stock.0000000010000.sql
nsy_scm.stock.0000000020000.sql
nsy_scm.stock.0000000030000.sql
nsy_scm.stock.0000000040000.sql
nsy_scm.stock.0000000050000.sql
nsy_scm.stock_business.0000000000000.sql

The tidb-lightning configuration file is as follows:
[lightning]
#region-concurrency=

level="info"
file="tidb-lightning.log"

[tikv-importer]
backend="local"
sorted-kv-dir="/tmp"

[mydumper]
data-source-dir="/home/tidb/dumpling/sql"
filter = ['*.*', '!mysql.*', '!sys.*', '!INFORMATION_SCHEMA.*', '!PERFORMANCE_SCHEMA.*', '!METRICS_SCHEMA.*', '!INSPECTION_SCHEMA.*']

[tidb]
host="172.16.1.201"
port=4000
user="root"
password="123456"
status-port=10080
pd-addr="172.16.1.202:2379"
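
The filter line in this config follows table-filter syntax, where a leading `!` excludes and, when several rules match, the last matching rule decides. The Python sketch below only approximates that behavior to show why `nsy_scm.stock` should pass this filter. The helper `is_included` is hypothetical, case-insensitive matching is an assumption, and `fnmatch` wildcards stand in for the real matcher.

```python
from fnmatch import fnmatchcase

# Filter rules as written in the config above.
RULES = [
    "*.*",
    "!mysql.*",
    "!sys.*",
    "!INFORMATION_SCHEMA.*",
    "!PERFORMANCE_SCHEMA.*",
    "!METRICS_SCHEMA.*",
    "!INSPECTION_SCHEMA.*",
]

def is_included(schema, table, rules):
    """Approximate table-filter evaluation: the last matching rule wins.

    Hypothetical helper for illustration; matching is case-insensitive here,
    which is an assumption rather than a statement about Lightning itself.
    """
    included = False
    for rule in rules:
        negate = rule.startswith("!")
        body = rule[1:] if negate else rule
        schema_pat, _, table_pat = body.partition(".")
        if (fnmatchcase(schema.lower(), schema_pat.lower())
                and fnmatchcase(table.lower(), table_pat.lower())):
            included = not negate
    return included

for schema, table in [("nsy_scm", "stock"), ("nsy_scm", "stock_business"), ("mysql", "user")]:
    print(schema, table, is_included(schema, table, RULES))
```

If this approximation holds, the filter is not what drops the `stock` data, since both `nsy_scm` tables are treated the same way.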

I imported the data into TiDB with the following command:
tiup tidb-lightning -config tidb-lightning.toml

The problem:
The table and data from nsy_scm.stock_business.0000000000000.sql were imported correctly.

However, for the following files:
nsy_scm.stock.0000000000000.sql
nsy_scm.stock.0000000010000.sql
nsy_scm.stock.0000000020000.sql
nsy_scm.stock.0000000030000.sql
nsy_scm.stock.0000000040000.sql
nsy_scm.stock.0000000050000.sql

The corresponding table is created, but it contains zero rows.

Do I need to configure a regular expression to match multiple files?
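
The file names listed above look like Dumpling's usual `<schema>.<table>.<sequence>.sql` layout, so they should already be distinguishable by table without a custom rule. The Python sketch below groups the names under that assumed layout. The regular expression and the `group_by_table` helper are illustrations, not Lightning's actual parsing code.

```python
import re
from collections import defaultdict

# Assumed Dumpling data-file layout: <schema>.<table>.<sequence>.sql
# Illustration only; this is not Lightning's own file router.
DATA_FILE = re.compile(r"^(?P<schema>[^.]+)\.(?P<table>[^.]+)\.(?P<seq>\d+)\.sql$")

def group_by_table(file_names):
    """Map (schema, table) to the sorted sequence numbers found for it."""
    groups = defaultdict(list)
    for name in file_names:
        match = DATA_FILE.match(name)
        if match is None:
            continue  # not a data file under the assumed naming scheme
        groups[(match["schema"], match["table"])].append(int(match["seq"]))
    return {key: sorted(seqs) for key, seqs in groups.items()}

files = [
    "nsy_scm.stock.0000000000000.sql",
    "nsy_scm.stock.0000000010000.sql",
    "nsy_scm.stock.0000000020000.sql",
    "nsy_scm.stock.0000000030000.sql",
    "nsy_scm.stock.0000000040000.sql",
    "nsy_scm.stock.0000000050000.sql",
    "nsy_scm.stock_business.0000000000000.sql",
]

grouped = group_by_table(files)
for (schema, table), seqs in grouped.items():
    print(f"{schema}.{table}: {len(seqs)} file(s), sequences {seqs}")
```

Because the table name is delimited by dots, `stock` and `stock_business` fall into separate groups even though one name is a prefix of the other.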

| username: wakaka

Could you please provide the tidb-lightning.log file for us to take a look?

| username: tidb狂热爱好者

[lightning]
# Log
level = "info"
file = "1tidb-lightning.log"
max-error = 9223372036854775807 # Fault tolerance is the most important

[tikv-importer]
# Choose the import mode to use
backend = "tidb"
#duplicate-resolution = 'remove'
# Set the temporary storage location for sorted key-value pairs; the target path must be an empty directory
sorted-kv-dir = "/dataa/tidba"

[[mydumper.files]]
pattern = '(?i)^(?:[^/]/)databasename_.?.tablename...?.csv'
schema = "old_system_data"
table = "trade_his_v2_0"
type = "csv"

[conflict]
strategy = "replace"
threshold = 9223372036854775807

[mydumper]
# Source data directory.
data-source-dir = "/data"
# Configure wildcard rules. The default rules filter out all tables under the mysql, sys, INFORMATION_SCHEMA, PERFORMANCE_SCHEMA, METRICS_SCHEMA, and INSPECTION_SCHEMA system databases.
# If this item is not configured, an "unable to find schema" exception will occur when importing system tables.
filter = ['*.*', '!mysql.*', '!sys.*', '!INFORMATION_SCHEMA.*', '!PERFORMANCE_SCHEMA.*', '!METRICS_SCHEMA.*', '!INSPECTION_SCHEMA.*']

[tidb]
# Information of the target cluster
host = ""
port = 4000
user = "root"
password = ""
# Table schema information is obtained from the "status port" of TiDB.
#status-port = 10080
# Address of the cluster PD
#pd-addr = ":2379"

| username: 我是人间不清醒

Does the table that ended up with zero rows have a primary key?

| username: linuxmysql

The table has zero rows, but its structure exists.

| username: tidb狂热爱好者

#pattern = '(?i)^(?:[^/]*/)trade_..ops_his_v2..[0-9].csv'
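
Before putting a routing pattern like this into `[[mydumper.files]]`, it can help to try it against the real file names. The Python sketch below compares two hypothetical patterns for the `nsy_scm.stock` files: one with unescaped dots and one with escaped dots and an end anchor. Neither pattern comes from this thread. Python's `re` module is used as a stand-in and is assumed to behave the same as Lightning's regex engine for expressions this simple.

```python
import re

files = [
    "nsy_scm.stock.0000000000000.sql",
    "nsy_scm.stock.0000000010000.sql",
    "nsy_scm.stock.0000000020000.sql",
    "nsy_scm.stock.0000000030000.sql",
    "nsy_scm.stock.0000000040000.sql",
    "nsy_scm.stock.0000000050000.sql",
    "nsy_scm.stock_business.0000000000000.sql",
]

# Hypothetical patterns, written for this illustration only.
# LOOSE leaves the dots unescaped, so "." matches any character.
LOOSE = r"(?i)^(?:[^/]*/)*nsy_scm.stock.*.sql$"
# STRICT escapes every literal dot and captures the sequence number.
STRICT = r"(?i)^(?:[^/]*/)*nsy_scm\.stock\.(\d+)\.sql$"

def matched(pattern, names):
    """Return the names that the pattern matches from the start of the string."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.match(name)]

loose_hits = matched(LOOSE, files)
strict_hits = matched(STRICT, files)

print("loose :", len(loose_hits), "file(s)")
print("strict:", len(strict_hits), "file(s)")
print("only matched by loose:", sorted(set(loose_hits) - set(strict_hits)))
```

The loose pattern also picks up the `stock_business` file, which would route another table's data into `stock`, so escaping the dots and anchoring the end of the pattern is the safer choice.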

| username: tidb狂热爱好者

Here you go. The official documentation is here, and it hasn’t been translated:

| username: 啦啦啦啦啦

Are these all the files that dumpling exported? Also, the sequence numbers are usually consecutive, right? Why do they jump directly from 0000000000000 to 0000000010000?