What is the underlying storage engine in TiKV?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiKV中的底层存储引擎是什么

| username: TiDB学习小白

When browsing TiDB’s official documentation, I learned that TiKV’s underlying storage engine is RocksDB (“RocksDB is the core storage engine of TiKV, used to store Raft logs and user data. Each TiKV instance has two RocksDB instances, one for storing Raft logs (commonly referred to as raftdb) and the other for storing user data and MVCC information (commonly referred to as kvdb)”). However, in subsequent studies, I learned that TiKV also has other storage engines, such as raft-engine. So what exactly is the default storage engine of TiKV? Can someone answer this? Thanks.

| username: yytest

TiKV indeed primarily relies on RocksDB as its core storage engine for storing Raft logs and user data. As described in the official TiDB documentation you referenced, each TiKV instance contains two RocksDB instances, raftdb and kvdb, which store Raft logs and user data along with MVCC information, respectively.

Regarding raft-engine, it is a newer storage engine that TiKV introduced specifically for Raft logs. Its design goal is to reduce write amplification and improve performance for the Raft log part. By storing Raft logs in a dedicated log-structured engine instead of a RocksDB instance, it takes that load off RocksDB and improves the overall write performance of the system.

Although raft-engine provides performance advantages, it is not intended to completely replace RocksDB, at least not in the current version. RocksDB remains the primary storage engine for TiKV, used for storing user data and MVCC information, while raft-engine is specifically for optimizing the storage of Raft logs. Therefore, it can be said that the default storage engine for TiKV is RocksDB, with raft-engine serving as a supplementary storage component to enhance performance in specific areas.

| username: YuchongXU

RocksDB

| username: tidb菜鸟一只

The default storage is RocksDB. RocksDB is a persistent storage engine used to store persistent data on a single node. Raft is a distributed consensus algorithm used to ensure the consistency and reliability of data replicas in a TiDB cluster.

| username: yytest

The default storage engine for TiKV is indeed RocksDB. As you have learned from the official TiDB documentation, TiKV uses RocksDB as its core storage engine to store Raft logs and user data. Each TiKV instance contains two RocksDB instances: one for storing Raft logs (called raftdb) and another for storing user data and MVCC information (called kvdb).

However, to improve performance and optimize storage space, TiKV introduced raft-engine as a new storage engine specifically for storing Raft logs. Raft-engine is designed to replace the RocksDB instance that stores Raft logs (raftdb), thereby reducing write amplification and improving write performance. It first appeared as an optional, experimental engine in v5.4 and, according to the TiDB release notes, has been the default store for Raft logs since v6.1.0.

Despite the introduction of raft-engine, it did not immediately replace RocksDB as the default store for Raft logs. In the versions where raft-engine is an experimental feature, it must be explicitly enabled through the configuration when starting TiKV. In those versions, if you do not enable raft-engine, TiKV continues to store Raft logs in RocksDB (raftdb). User data and MVCC information are stored in RocksDB (kvdb) either way.
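To make the configuration switch concrete, here is a minimal Python sketch of how the choice could be reasoned about. As far as I know, `raft-engine.enable` is the relevant key in the TiKV configuration file. The function name `raft_log_store` and the `default_enabled` parameter are hypothetical and only for illustration. The default is passed in rather than hard-coded because it depends on the TiKV version.

```python
def raft_log_store(config: dict, default_enabled: bool) -> str:
    """Return which engine stores Raft logs for a parsed TiKV config.

    `config` is the TiKV config file as a nested dict (hypothetical input).
    `default_enabled` is what the TiKV version in use assumes when
    `raft-engine.enable` is not set; it is an assumption supplied by the caller.
    """
    enabled = config.get("raft-engine", {}).get("enable", default_enabled)
    return "raft-engine" if enabled else "raftdb (RocksDB)"


# Explicitly enabled on a version where the feature is opt-in.
print(raft_log_store({"raft-engine": {"enable": True}}, default_enabled=False))
# Nothing configured on a version where the feature is opt-in.
print(raft_log_store({}, default_enabled=False))
```

This only covers Raft logs. User data and MVCC information are not affected by this switch.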

| username: TiDBer_QYr0vohO

RocksDB

| username: TiDB学习小白

Thank you for your reply. Do you mean that by default, two RocksDB instances are still used to store Raft logs and user data separately? I ask because when I previously deployed a TiDB cluster with tiup, I found that TiKV has two directories under the data directory tidb-data: db, which stores SST files, and raft-engine, which stores data in raftlog format. The version I deployed is v7.5.0, so I find it a bit strange. The data directory structure is as follows:

| username: TiDB学习小白

Okay, thank you.

| username: TiDB学习小白

Okay, thank you!

| username: TiDB学习小白

Okay, I understand.

| username: zhanggame1

The documentation and video courses here are a bit outdated. Raft-engine has been the default store for Raft logs for a while now; user data is still stored in RocksDB.
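One way to check which Raft log store an instance is actually using is to look at its data directory, as the original poster did. Below is a rough Python sketch of that check. It assumes the default layout, in which raft-engine writes to a `raft-engine` subdirectory and the RocksDB-based raftdb writes to a `raft` subdirectory. The `raft` name and both helper functions are assumptions for illustration, so adjust them if your paths are customized.

```python
from pathlib import Path


def _non_empty_dir(path: Path) -> bool:
    """True if `path` is a directory containing at least one entry."""
    return path.is_dir() and any(path.iterdir())


def detect_raft_log_store(tikv_data_dir: str) -> str:
    """Guess the Raft log store from the TiKV data directory layout.

    Assumes default subdirectory names: `raft-engine` for raft-engine
    and `raft` for the RocksDB-based raftdb.
    """
    root = Path(tikv_data_dir)
    has_raft_engine = _non_empty_dir(root / "raft-engine")
    has_raftdb = _non_empty_dir(root / "raft")
    if has_raft_engine and has_raftdb:
        return "both present"
    if has_raft_engine:
        return "raft-engine"
    if has_raftdb:
        return "raftdb (RocksDB)"
    return "unknown"
```

On a layout like the one described above (a `db` directory plus a non-empty `raft-engine` directory), this would report `raft-engine`.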

| username: TiDB学习小白

Okay, thank you.

| username: 鱼跃龙门

The default storage is RocksDB, which serves as the persistent storage engine. The RocksDB instance raftdb stores Raft logs, and the RocksDB instance kvdb stores key-value data. Raft is a distributed consensus algorithm that replicates data to Follower nodes, ensuring the consistency and reliability of data replicas in the TiDB cluster.

| username: 濱崎悟空

You can check out the RocksDB 101 course~

| username: ziptoam

TiKV uses RocksDB, while TiFlash uses columnar storage.