TiKV is used in RawKV mode. If a TiKV node fails to start and RocksDB contains a large number of corrupted SST files, can technical support help recover the business key/value data from those bad SST files?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiKV是使用模式是裸KV,如果有TiKV节点无法启动,其中RocksDB中有大量的损坏的大部分SST文件,有办法提供技术支持从这些Bad SST文件中恢复处业务的key/value数据吗?

| username: TiDBer_X4mIBZTj

  • TiKV is used as a RawKV cluster.
  • A large number of RocksDB SST files on the TiKV nodes are corrupted.
  • Is there any way to recover the business raw keys and values from these corrupted SST files?
| username: TiDBer_jYQINSnf | Original post link

With that much damage, it is uncertain whether RocksDB can even be opened.
Is this RawKV or TxnKV? If it is RawKV, with no transactions involved, would whatever data can still be read be usable for you?
tikv-ctl ldb scan --from XXXXXX --to XXXXXXXXXXX --db=tikv/db --column_family='write' --hex

This scans the data in the write column family (writecf). You can replace 'write' with 'default' or 'lock'.

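If the DB cannot be opened, try dumping the SST files one by one. The data obtained this way is almost unusable, though, because a single SST file does not reflect the latest state: newer entries in upper levels override older entries in lower levels. In other words, a key that still appears in a level-6 SST may already have been deleted in level 0.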
./tikv-ctl ldb dump --path=xx.sst

This is the only option left to try. If partial, possibly stale data is still of some use to you, it is worth a try.
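
To illustrate why per-file dumps can resurrect stale or deleted keys, here is a minimal Python sketch of how dumped entries could be reconciled afterwards. It assumes you have already parsed the dump output into records that carry the user key, the RocksDB sequence number, and whether the entry is a delete marker. The `Record` type and `resolve_latest` function are hypothetical names for this illustration, not part of tikv-ctl or RocksDB. The only rule it relies on is that, for a given key, the entry with the highest sequence number wins.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Record(NamedTuple):
    """One entry parsed from an SST dump (hypothetical layout)."""
    user_key: bytes
    seq: int                 # RocksDB sequence number; larger means newer
    is_delete: bool          # True if the entry is a delete marker (tombstone)
    value: Optional[bytes]   # None for delete markers


def resolve_latest(records: Iterable[Record]) -> Dict[bytes, bytes]:
    """Keep only the newest entry per key and drop keys whose newest entry is a delete."""
    newest: Dict[bytes, Record] = {}
    for rec in records:
        current = newest.get(rec.user_key)
        if current is None or rec.seq > current.seq:
            newest[rec.user_key] = rec
    return {
        key: rec.value
        for key, rec in newest.items()
        if not rec.is_delete and rec.value is not None
    }


if __name__ == "__main__":
    dumped = [
        Record(b"a", 5, False, b"old"),   # e.g. found in a level-6 SST
        Record(b"a", 9, True, None),      # newer delete marker from level 0
        Record(b"b", 3, False, b"v1"),
        Record(b"b", 7, False, b"v2"),
    ]
    print(resolve_latest(dumped))
```

If the SST file that held a delete marker or a newer value is itself unreadable, the older entry survives this merge, which is exactly the risk described above; treat the result as best-effort data.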

| username: ziptoam | Original post link

Is it possible to recover through WAL?

| username: cchouqiang | Original post link

Refer to this article: Column - Using Online unsafe recovery to restore a v6.2 same-city emergency cluster | TiDB Community
Version v6 provides Online Unsafe Recovery, which can be used to perform recovery operations.