Issues with tikv-importer

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv-importer的问题

| username: 我家有个臭皮崽

[Requirement]:
We currently use a TiKV cluster as KV storage and want to implement an offline data import function. According to the documentation, TiKV can import SST files through tikv-importer. By the way, our current client is a C++ client built on the protobuf definitions.
[Existing Information]
GitHub - tikv/importer: tikv-importer is a front-end to help ingesting large number of KV pairs into a TiKV cluster - tikv-importer tool
GitHub - pingcap/kvproto: Protocol buffer files for TiKV - protobuf definitions for importer-kv and sst-importer
[Question]:
How can tikv-importer be used as a data import component?

I am confused after reading the Lightning documentation. Does it still work if the Lightning tool and the TiKV cluster are not on the same machine?

Can the following deployment method be achieved using tikv-importer?

Machine 1 ------------------- Machine 2 -------------------------------------- Machine 3
importer-client ---------> tikv-importer Server ---------------> tikv cluster

| username: Billmay表妹 | Original post link

If you want to use tikv-importer to import SST files, you can follow these steps:

  1. Deploy the tikv-importer component. You can refer to the official documentation of tikv-importer [1].

  2. Prepare the SST files. You can generate them with the TiDB Lightning tool or with other tools.

  3. Use a client to send an import request to tikv-importer. The request should include the path of the SST file and the information of the target table. The specific request format can be found in the importer-kv.proto and sst-importer.proto files in kvproto [2].

  4. After receiving the import request, tikv-importer imports the data from the SST file into the TiKV cluster.
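The request flow in steps 3 and 4 can be sketched as an ordered call sequence. This is a minimal sketch, not the tikv-importer client API. The RPC names (OpenEngine, WriteEngine, CloseEngine, ImportEngine, CleanupEngine) are how the ImportKV service in kvproto is recalled here and should be checked against the proto files. Note that this flow writes KV pairs into an engine rather than passing an SST file path. `FakeImporter` and `import_pairs` are hypothetical names standing in for a generated gRPC stub, and a plain dict stands in for the TiKV cluster. The same ordering would apply to a C++ client.

```python
import uuid


class FakeImporter:
    """Hypothetical in-memory stand-in for a tikv-importer gRPC stub."""

    def __init__(self):
        self.engines = {}  # engine id -> {"pairs": [...], "closed": bool}
        self.cluster = {}  # stands in for the TiKV cluster
        self.calls = []    # RPC names in the order they were issued

    def open_engine(self, engine_id):
        self.calls.append("OpenEngine")
        self.engines[engine_id] = {"pairs": [], "closed": False}

    def write_engine(self, engine_id, pairs):
        self.calls.append("WriteEngine")
        engine = self.engines[engine_id]
        if engine["closed"]:
            raise RuntimeError("cannot write to a closed engine")
        engine["pairs"].extend(pairs)

    def close_engine(self, engine_id):
        self.calls.append("CloseEngine")
        self.engines[engine_id]["closed"] = True

    def import_engine(self, engine_id):
        self.calls.append("ImportEngine")
        engine = self.engines[engine_id]
        if not engine["closed"]:
            raise RuntimeError("engine must be closed before import")
        # Ingest in key order, as sorted files would be.
        for key, value in sorted(engine["pairs"]):
            self.cluster[key] = value

    def cleanup_engine(self, engine_id):
        self.calls.append("CleanupEngine")
        self.engines.pop(engine_id, None)


def import_pairs(importer, pairs, batch_size=2):
    """Drive one engine through open -> write -> close -> import -> cleanup."""
    engine_id = uuid.uuid4().bytes  # assumption: engines are keyed by a UUID
    importer.open_engine(engine_id)
    try:
        for start in range(0, len(pairs), batch_size):
            importer.write_engine(engine_id, pairs[start:start + batch_size])
        importer.close_engine(engine_id)
        importer.import_engine(engine_id)
    finally:
        # Always release the engine, even if a step above failed.
        importer.cleanup_engine(engine_id)
    return engine_id
```

Calling `import_pairs(FakeImporter(), [(b"k2", b"v2"), (b"k1", b"v1")])` walks through the whole sequence once. In a real client each method would be a gRPC call to the tikv-importer server, and the engine must be closed before it can be imported.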

Regarding the difference between the Lightning tool and tikv-importer: the Lightning tool is an official data import tool provided by TiDB, which supports importing data from MySQL, TiDB, TiKV, and other data sources. tikv-importer, on the other hand, is an independent component used specifically to import SST files into the TiKV cluster. You can refer to the official documentation [1].

As for whether the Lightning tool and tikv-importer can be deployed on different machines, the answer is yes. Each can be deployed on any machine, as long as that machine has network access to the TiDB and TiKV clusters.

Finally, the deployment method you mentioned, i.e., client → tikv-importer server → TiKV cluster, is feasible. The client sends an import request to tikv-importer, and tikv-importer then imports the SST file into the TiKV cluster. The request definitions can be found in the importer-kv.proto and sst-importer.proto files in kvproto [2].
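Since the only stated requirement for a multi-machine deployment is network access, a quick reachability check on each hop can save debugging time. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the ports in the usage note (8287 for tikv-importer, 2379 for PD) are assumed defaults that should be replaced with the values from your own deployment.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_topology(endpoints, timeout=2.0):
    """Map each endpoint name to whether it is reachable from this machine."""
    return {
        name: can_connect(host, port, timeout)
        for name, (host, port) in endpoints.items()
    }
```

For example, on Machine 1 you might run `check_topology({"tikv-importer": ("machine2", 8287)})`, and on Machine 2 `check_topology({"pd": ("machine3", 2379)})`, where the host names are placeholders. A successful TCP connection only shows that the port is open. It does not prove that the service behind it is healthy.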

| username: 胡杨树旁 | Original post link

Could you please tell me how the TiDB Lightning tool generates SST files? Is this covered in the official documentation?

| username: Billmay表妹 | Original post link

The TiDB Lightning tool can export data from TiDB into SST files. The specific steps are as follows:

  1. Configure the TiDB Lightning tool:

    Example configuration file:

    [lightning]
    # Log level
    level = "info"
    # Path to the Lightning log file
    file = "/path/to/lightning.log"
    
    [tikv-importer]
    # Use the local backend
    backend = "local"
    # Directory for the temporary sorted KV files
    sorted-kv-dir = "/path/to/sorted-kv-dir"
    
    [mydumper]
    # Directory containing the source data files
    data-source-dir = "/path/to/mydumper"
    
    [tidb]
    # TiDB address
    host = "127.0.0.1"
    # TiDB port number
    port = 4000
    # TiDB username
    user = "root"
    # TiDB password
    password = ""
    # TiDB status port
    status-port = 10080
    # PD address
    pd-addr = "127.0.0.1:2379"
    

    The [lightning] section configures logging, the [tikv-importer] section selects the backend and the directory for the sorted KV files, the [mydumper] section points to the source data directory, and the [tidb] section contains the TiDB and PD connection information.

  2. Export TiDB data:

    Use the TiDB Lightning tool to export TiDB data, including SST files:

    tiup tidb-lightning \
        --config /path/to/lightning.toml \
        --tidb-host 127.0.0.1 \
        --tidb-port 4000 \
        --tidb-user root \
        --tidb-password "" \
        --backend local \
        --sorted-kv-dir /path/to/sorted-kv-dir \
        --enable-checkpoint=false \
        --log-file /path/to/lightning.log
    

    Here, --backend local selects the local backend, in which Lightning encodes the data into sorted key-value files on local disk before ingesting them into TiKV; --enable-checkpoint=false disables checkpoints; and --log-file specifies the path to the log file.

    The sorted KV files are stored in the directory specified by the --sorted-kv-dir parameter.

| username: 胡杨树旁 | Original post link

Okay, thank you. I always thought Lightning could only import data into the TiDB cluster. I didn’t know Lightning could also export data from TiDB. That’s amazing, I’ll give it a try. :+1:

| username: redgame | Original post link

It is definitely possible.
