Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tikv-importer的问题 (Questions about tikv-importer)
[Requirement]:
We currently use a TiKV cluster as KV storage and want to implement an offline data import function. According to the documentation, TiKV can ingest SST files through tikv-importer. BTW: our current client is a C++ client built on the protobuf definitions.
[Existing Information]
GitHub - tikv/importer: tikv-importer is a front-end to help ingesting large number of KV pairs into a TiKV cluster - tikv-importer tool
GitHub - pingcap/kvproto: Protocol buffer files for TiKV - protobuf definitions for importer-kv and sst-importer
[Question]:
How to use tikv-importer as a data import component?
I am confused after reading the Lightning documentation. Does it still work if the Lightning tool and the TiKV cluster are not on the same machine?
Can the following deployment method be achieved using tikv-importer?
Machine 1                  Machine 2                      Machine 3
importer-client  ------>   tikv-importer server  ------>  tikv cluster
If you want to use tikv-importer to import data, you can follow these steps:
- Deploy the tikv-importer component. You can refer to the official documentation of tikv-importer [1].
- Prepare the data as key-value pairs. The client does not hand ready-made SST files to tikv-importer; tikv-importer accepts KV pairs, sorts them, and writes the SST files itself. TiDB Lightning is the usual client for this, but any client that speaks the same gRPC protocol can play that role.
- Use a client to send import requests to tikv-importer. The client opens an engine, streams the KV pairs into it, closes it, and then asks tikv-importer to import it. The specific request formats can be found in the import_kvpb.proto and import_sstpb.proto files in kvproto [2].
- After receiving the import request, tikv-importer ingests the generated SST files into the TiKV cluster.
Regarding the difference between the Lightning tool and tikv-importer: TiDB Lightning is the official TiDB data import tool. It reads source files such as SQL dumps or CSV files (for example, those produced by Dumpling or Mydumper) and imports them into a TiDB cluster. tikv-importer, on the other hand, is a separate component that receives KV pairs, turns them into SST files, and ingests them into the TiKV cluster; Lightning normally acts as its client. You can refer to the official documentation [1].
As for whether the Lightning tool and tikv-importer can be deployed on different machines, the answer is yes. They do not need to run on the same machine as each other or as the cluster, as long as the network allows Lightning to reach tikv-importer and TiDB, and tikv-importer to reach the PD and TiKV nodes.
Finally, the deployment method you mentioned, i.e., client → tikv-importer server → TiKV cluster, is feasible. The client streams KV pairs to tikv-importer, and tikv-importer sorts them, writes the SST files, and ingests them into the TiKV cluster. The request and response messages are defined in the import_kvpb.proto and import_sstpb.proto files in kvproto [2].
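To make the call order concrete, here is a minimal Python sketch of the engine lifecycle as I understand it from the ImportKV service in kvproto (OpenEngine, WriteEngine, CloseEngine, ImportEngine, CleanupEngine). It is not a real gRPC client: FakeImporter, run_import, and batch_size are hypothetical names, and the "cluster" is just a dict. A real client would use the stubs generated from the proto files and would probably also need the mode-switching calls that Lightning issues.

```python
import uuid


class FakeImporter:
    """In-memory stand-in for a tikv-importer server (illustration only)."""

    def __init__(self):
        self.engines = {}  # engine id -> {"state": str, "pairs": list}
        self.cluster = {}  # stands in for the TiKV cluster

    def open_engine(self, engine_id):
        self.engines[engine_id] = {"state": "open", "pairs": []}

    def write_engine(self, engine_id, pairs):
        engine = self.engines[engine_id]
        if engine["state"] != "open":
            raise RuntimeError("engine is not open for writes")
        engine["pairs"].extend(pairs)

    def close_engine(self, engine_id):
        engine = self.engines[engine_id]
        # The importer, not the client, is responsible for sorting the pairs.
        engine["pairs"].sort()
        engine["state"] = "closed"

    def import_engine(self, engine_id):
        engine = self.engines[engine_id]
        if engine["state"] != "closed":
            raise RuntimeError("engine must be closed before import")
        self.cluster.update(engine["pairs"])
        engine["state"] = "imported"

    def cleanup_engine(self, engine_id):
        del self.engines[engine_id]


def run_import(importer, pairs, batch_size=2):
    """Drive one engine through open -> write -> close -> import -> cleanup."""
    engine_id = uuid.uuid4().bytes  # assumption: engines are keyed by a UUID
    importer.open_engine(engine_id)
    for start in range(0, len(pairs), batch_size):
        importer.write_engine(engine_id, pairs[start:start + batch_size])
    importer.close_engine(engine_id)
    importer.import_engine(engine_id)
    importer.cleanup_engine(engine_id)
```

The point of the sketch is only the ordering: writes are accepted while the engine is open, and the import step is valid only after the engine has been closed.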
Could you please tell me how the TiDB Lightning tool generates SST files? Is this covered in the official documentation?
TiDB Lightning does not export data from TiDB; it works in the import direction. It reads source files (SQL dump or CSV), encodes the rows into KV pairs, and the SST data is produced as an intermediate step of the import rather than as files handed back to you. The specific steps are as follows:
- Configure the TiDB Lightning tool:
Example configuration file:
[lightning]
# Log level
level = "info"
# Path to the lightning log file
file = "/path/to/lightning.log"
[tikv-importer]
# "local" sorts the KV pairs on the Lightning machine and ingests them into TiKV directly
backend = "local"
# Directory for the locally sorted KV data
sorted-kv-dir = "/path/to/sorted-kv"
[mydumper]
# Path to the source data files (SQL dump or CSV)
data-source-dir = "/path/to/mydumper"
[tidb]
# TiDB address
host = "127.0.0.1"
# TiDB port number
port = 4000
# TiDB username
user = "root"
# TiDB password
password = ""
# TiDB status port
status-port = 10080
# PD address
pd-addr = "127.0.0.1:2379"
The [tidb] section contains the TiDB connection information, the [mydumper] section points to the source data to be imported, and the [tikv-importer] section selects the import backend.
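Since a mistyped or non-existent section name is easy to miss, a quick sanity check on the configuration file can help. The Python sketch below flags top-level sections that are not in a known list. The list is an assumption based on the sample tidb-lightning.toml as I remember it and may be incomplete for your version; unknown_sections and KNOWN_SECTIONS are hypothetical names, and the check is a line-based heuristic, not a TOML parser.

```python
import re

# Assumption: top-level sections seen in the sample tidb-lightning.toml.
# This set may be incomplete for newer Lightning versions.
KNOWN_SECTIONS = {
    "lightning", "security", "checkpoint", "tikv-importer",
    "mydumper", "tidb", "post-restore", "cron", "routes",
}

# Matches [section], [section.sub] and [[array-table]] header lines.
SECTION_RE = re.compile(r"^\s*\[{1,2}\s*([^\[\]]+?)\s*\]{1,2}\s*(?:#.*)?$")


def unknown_sections(toml_text):
    """Return top-level section names that are not in KNOWN_SECTIONS."""
    unknown = []
    for line in toml_text.splitlines():
        match = SECTION_RE.match(line)
        if not match:
            continue
        top = match.group(1).split(".")[0].strip()
        if top not in KNOWN_SECTIONS and top not in unknown:
            unknown.append(top)
    return unknown


SAMPLE = """
[lightning]
level = "info"
[tikv-importer]
backend = "local"
[mydumper]
data-source-dir = "/path/to/mydumper"
[mydumper.csv]
header = true
[myloader]
threads = 16
"""

print(unknown_sections(SAMPLE))
```

For the sample above, the only section reported is myloader, which is not a section I know of in the Lightning configuration.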
- Run the import:
Start TiDB Lightning with the configuration file:
tiup tidb-lightning \
--config /path/to/lightning.toml \
--tidb-host 127.0.0.1 \
--tidb-port 4000 \
--tidb-user root \
--tidb-password "" \
--backend local \
--enable-checkpoint=false \
--log-file /path/to/lightning.log
Here, --backend local indicates that Lightning sorts the KV pairs locally and ingests them into TiKV directly, without a separate tikv-importer server; --enable-checkpoint=false indicates that checkpoints are not enabled; and --log-file specifies the path to the log file.
The source files are read from the directory given by data-source-dir (or the -d command-line option), and the temporary sorted KV data is written under sorted-kv-dir.
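If you start Lightning from a script, it may be easier to build the command as an argument list than to maintain a long shell line. This is a small Python sketch under that assumption; lightning_command is a hypothetical helper, and it only uses flags I am reasonably sure exist (--config, --backend, -d, --sorted-kv-dir). Please check tidb-lightning --help for your version before relying on it.

```python
import shlex


def lightning_command(config_path, backend="local",
                      source_dir=None, sorted_kv_dir=None):
    """Build the argv for a tidb-lightning run started through tiup."""
    argv = ["tiup", "tidb-lightning",
            "--config", config_path,
            "--backend", backend]
    if source_dir:
        argv += ["-d", source_dir]  # overrides mydumper.data-source-dir
    if sorted_kv_dir:
        argv += ["--sorted-kv-dir", sorted_kv_dir]
    return argv


argv = lightning_command("/path/to/lightning.toml",
                         source_dir="/path/to/mydumper")
# shlex.join gives a copy-pasteable shell line; pass argv itself to
# subprocess.run(argv, check=True) to actually start the import.
print(shlex.join(argv))
```

Keeping the arguments as a list avoids shell quoting problems when a path contains spaces.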
Okay, thank you. I always thought Lightning could only import data into the TiDB cluster. I didn’t know Lightning could also export data from TiDB. That’s amazing, I’ll give it a try. 
It is definitely possible.