TiKV Data Backup Consultation

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiKV 数据备份咨询

| username: TiDBer_OjUSLomJ

Currently, TiKV provides its own backup and restore tool, tikv-br, and TiDB’s br also provides experimental TiKV backup functionality.

Based on the scenario of using TiKV independently without TiDB, I would like to inquire about the following questions:

  • The TiKV br manual mentions that it is a “RawKV” backup. If interacting with TiKV using the txnKV interface, is it feasible to use RawKV for backup and restore? Will the transaction information related to txnKV be affected?
  • The experimental br backup raw command is provided in TiDB br. Is there any difference between this command and TiKV’s br?
  • The experimental br backup txn command is provided in TiDB br. Can this command be used for TiKV’s own backup?
| username: 有猫万事足

These issues are all related to the setting of the api-version parameter.

* The TiKV BR manual mentions that it is a “RawKV” backup. If you use the txnKV interface to interact with TiKV, is it feasible to use RawKV for backup and recovery? Will the transaction information related to txnKV be affected?

With api-version=1 it is feasible. With api-version=2 it is not, because the documentation clearly states that under api-version=2 data is divided into ranges by usage, so that TiDB, transactional KV, and RawKV applications can coexist in a single cluster.
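For reference, api-version is a TiKV configuration item under the [storage] section. The sketch below only writes an illustrative fragment to a scratch file and prints it; the file path is hypothetical and nothing here touches a running cluster. Whether the value can be changed on a cluster that already holds data is not covered in this thread, so treat it as a deployment-time choice.

```shell
#!/bin/sh
# Hypothetical scratch path, used only for illustration.
CONF_FRAGMENT="${TMPDIR:-/tmp}/tikv-storage-fragment.toml"

# Write a minimal [storage] fragment with the API version discussed above.
cat > "$CONF_FRAGMENT" <<'EOF'
[storage]
# 1 = API V1, the case the answer above calls feasible.
# 2 = API V2: keys are split into ranges by usage; per the TiKV docs it
#     also expects enable-ttl = true.
api-version = 1
EOF

# Show the fragment so it can be merged into the real TiKV config by hand.
cat "$CONF_FRAGMENT"
```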

* The tidb br command provides an experimental br backup raw command. Is there any difference between this command and tikv’s br?

I skimmed the code of both and compared them directly; they are almost identical.

This is tikv-br

This is br backup raw. You can see that apart from the different URLs, the code inside is almost the same. However, to be safe, you should compare it yourself.

* The tidb br command provides an experimental br backup txn command. Can this command be used for TiKV’s own backup?

I can’t find this br backup txn command on the master branch. I can’t find it in the code either. I don’t know where you heard about it.
Even if it exists, according to the api-version description, when api-version=1, it is definitely possible. If it is 2, it is likely not feasible.

| username: wangkk2024

Impressive, you all read the source code.

| username: TiDBer_HVBE81bh

My usage scenario is similar to yours. I use TiKV and PD to form a KV database cluster, and the upper-layer application writes data through the txnKV interface. My version is 6.5.6. I looked up the corresponding documentation and tested it: the RawKV BR tool cannot back up or restore txnKV data.

TiKV 7.1 and above provide a txnKV backup function, but it is an experimental feature. In my testing, the backup function is usable but not very complete. I did not verify whether the transaction information related to txnKV is affected during the backup. The backup and restore tool needs to be installed with tiup.

Backup and restore tool installation: tiup install br:v7.1.5

Test environment description: Version 7.1.5
Server 1 → ip: 10.202.37.139 running TiKV, PD processes (leader)
Server 2 → ip: 10.202.41.98 running TiKV, PD processes
Server 3 → ip: 10.202.41.18 running TiKV, PD processes

You can use this command to see which PD is the leader: pd-ctl -u "http://10.202.37.139:2379" member
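If only the leader is of interest, pd-ctl also has a narrower subcommand, member leader show. A minimal sketch, assuming pd-ctl is on the PATH and using the PD address from the test environment above. The run helper is a hypothetical wrapper that prints the command by default and only executes it when RUN=1 is set.

```shell
#!/bin/sh
# PD address taken from the test environment described above.
PD_URL="http://10.202.37.139:2379"

# Hypothetical helper: print the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Ask PD for the current leader only, instead of the full member list.
show_leader() {
  run pd-ctl -u "$PD_URL" member leader show
}

show_leader
```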

Backup
Execute the backup command on the leader PD


Any PD in the cluster can be given as the PD address in the command. I back up to the local file system, so the backup path must exist on every TiKV node in the cluster. After the backup, two kinds of files appear in the backup path: the metadata file, backupmeta, and the actual backup data, a directory named 1 (only one data directory was generated, maybe because my database is small). Perhaps because txnKV backup is an experimental feature, the metadata file is generated only on the machine that executes the backup command, and the backup data is generated only on the leader PD server. In my experiment the server executing the backup command was also the leader PD, so both kinds of files ended up in its backup path, and no files were generated in the backup paths of the other two servers.
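The backup command itself did not survive in this post. A minimal sketch of what it presumably looked like, assuming it mirrors the restore command in this post (same PD address and the same local:///root/bak storage path); the exact flags the poster used are not known. The run helper is a hypothetical wrapper that prints the command by default and executes it only when RUN=1 is set.

```shell
#!/bin/sh
# Assumed values, copied from the restore command and test environment in this post.
PD_ADDR="10.202.37.139:2379"
STORAGE="local:///root/bak"

# Hypothetical helper: print the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Assumed form of the txnKV backup: "backup txn" by symmetry with "restore txn".
backup_txn() {
  run tiup br backup txn --pd "$PD_ADDR" --storage "$STORAGE"
}

backup_txn
```

Run it once as-is to check the printed command, then with RUN=1 on the server where the backup should be executed.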

Restore
tiup br restore txn --pd "10.202.37.139:2379" --storage "local:///root/bak"

  • You can specify any PD in the cluster
  • Data can only be restored to a newly deployed TiKV cluster
  • The server executing the restore command needs to have the metadata file: backupmeta, and all TiKV nodes need to have the actual backup data. Since the backup data is only generated on the leader PD node, you need to copy the data to the backup paths of the other TiKV nodes.
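The copy step in the last point can be sketched as below. This assumes the backup sits in /root/bak on 10.202.37.139, that the script runs on that server, and that rsync and root SSH access to the other two TiKV nodes are available; none of that is stated in the post. The run helper is a hypothetical wrapper that prints each command by default and executes it only when RUN=1 is set.

```shell
#!/bin/sh
# Assumed backup path and target nodes, taken from the test environment above.
BACKUP_DIR="/root/bak"
OTHER_NODES="10.202.41.98 10.202.41.18"

# Hypothetical helper: print the command unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Copy the backup directory to the same path on every other TiKV node.
sync_backup() {
  for node in $OTHER_NODES; do
    run rsync -a "$BACKUP_DIR/" "root@$node:$BACKUP_DIR/"
  done
}

sync_backup
```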