Raft Log Cleanup Mechanism

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: raft log 日志清理机制

| username: 胡杨树旁

The Raft log is the log written before the data is persisted. I would like to ask how these log files are cleaned up and which parameters control the cleanup.

| username: dba远航

In MySQL, binary log retention is controlled by the expire_logs_days parameter.

| username: 随缘天空

The main configurations are as follows:

For more details, you can refer to the following link: TiKV Configuration File Description (TiKV 配置文件描述) | PingCAP Documentation Center

I think if the number of remaining Raft logs or the size of the remaining Raft logs exceeds the allowed threshold, the cluster might start the cleanup operation.
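
As a rough illustration of the "count or size" idea above, here is a minimal Python sketch. It is not TiKV code. The class and function names are hypothetical and the limit values are made up for the example. In a real cluster the relevant settings live in the `[raftstore]` section of the TiKV configuration (for example `raft-log-gc-count-limit` and `raft-log-gc-size-limit`), and their defaults depend on the version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaftLogGcLimits:
    """Hypothetical holder for the two limits discussed above."""
    count_limit: int        # max number of Raft log entries to keep around
    size_limit_bytes: int   # max total size of retained Raft log entries


def should_start_cleanup(entry_count: int, total_bytes: int,
                         limits: RaftLogGcLimits) -> bool:
    """Return True when either limit is reached (an OR, not an AND)."""
    return (entry_count >= limits.count_limit
            or total_bytes >= limits.size_limit_bytes)


# Illustrative values only, not TiKV defaults.
limits = RaftLogGcLimits(count_limit=1000, size_limit_bytes=10 * 1024 * 1024)
print(should_start_cleanup(1200, 1024, limits))   # count limit reached
print(should_start_cleanup(10, 1024, limits))     # neither limit reached
```

Modeling the check as a plain OR means no parameter takes precedence over the other: whichever limit is hit first is enough.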

| username: heiwandou

Mainly the log size and days

| username: 胡杨树旁

These parameters are all at their default values and have not been modified, but I still don't quite understand which one takes precedence, or whether there is an internal cleanup mechanism.

| username: swino

TiKV provides a mechanism for cleaning up Raft Logs, and the specific process is as follows:

  1. Each TiKV instance calculates the log cleanup threshold from the configured parameters to determine whether its Raft logs need to be cleaned up.
  2. It checks whether the gap between the committed log index and the last compacted index reaches the minimum requirement. If not, the Raft log cleanup is skipped for the time being.
  3. If there is room to shrink the Raft log, the cleanup is carried out in the storage engine that holds the Raft logs, such as RocksDB.
  4. The cleanup strategy calculates the overlap of log_storage_size across all peers within a certain time window, selects the minimum value as the compaction point, and deletes the Raft logs that have already been compacted.
  5. After the Raft logs have been compacted and cleaned up, the peer_cache and region_cache within the TiKV instance are updated synchronously to keep the Raft data consistent and reliable.
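
Steps 2 and 4 above can be sketched roughly as follows. This is a simplified model written for illustration, not TiKV's actual implementation: `RegionLogState`, `pick_compact_index`, `compact`, and the `min_gap` parameter are all hypothetical names, and the only assumptions are that compaction is skipped when too few entries have accumulated and that the slowest peer bounds what may be dropped.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class RegionLogState:
    """Simplified, hypothetical view of one Region's Raft log on the leader."""
    first_index: int            # oldest index still stored
    committed_index: int        # highest committed index
    matched: Dict[str, int]     # peer name -> highest index replicated to it


def pick_compact_index(state: RegionLogState, min_gap: int) -> Optional[int]:
    """Return the new first index to keep, or None to skip this round."""
    # Step 2: skip if too few entries accumulated since the last compaction.
    if state.committed_index - state.first_index < min_gap:
        return None
    # Step 4: the slowest peer bounds what can be dropped safely.
    slowest = min(state.matched.values(), default=state.committed_index)
    target = min(slowest, state.committed_index)
    return target if target > state.first_index else None


def compact(log: List[Tuple[int, bytes]],
            compact_index: int) -> List[Tuple[int, bytes]]:
    """Drop every entry whose index is below compact_index."""
    return [(idx, data) for idx, data in log if idx >= compact_index]


state = RegionLogState(first_index=1, committed_index=100,
                       matched={"peer-a": 100, "peer-b": 80, "peer-c": 95})
idx = pick_compact_index(state, min_gap=50)
log = [(i, b"entry") for i in range(1, 101)]
if idx is not None:
    log = compact(log, idx)
print(idx, len(log))
```

Taking the minimum over all peers is the conservative choice here: a lagging follower can still catch up from the log instead of needing a full snapshot.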

| username: andone

Log size and days

| username: 随缘天空

First, log cleanup is only initiated after the data has been successfully written from memory to disk. As for the default configuration items, the cleanup process starts as soon as any one of the conditions is met.
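
The "persist first, then clean up" ordering can be illustrated with a toy log. This is only a sketch of the principle described above: `InMemoryRaftLog` and its methods are hypothetical, and a real TiKV node keeps its Raft logs in a storage engine rather than in a Python dict.

```python
class InMemoryRaftLog:
    """Toy log used only to illustrate the ordering constraint."""

    def __init__(self) -> None:
        self.entries = {}        # index -> payload
        self.applied_index = 0   # highest index whose data is persisted

    def append(self, index: int, payload: bytes) -> None:
        self.entries[index] = payload

    def mark_applied(self, index: int) -> None:
        # Called once the data for entries up to `index` is safely on disk.
        self.applied_index = max(self.applied_index, index)

    def compact(self, up_to: int) -> int:
        """Delete entries with index <= up_to, never past applied_index."""
        limit = min(up_to, self.applied_index)
        doomed = [i for i in self.entries if i <= limit]
        for i in doomed:
            del self.entries[i]
        return len(doomed)


log = InMemoryRaftLog()
for i in range(1, 6):
    log.append(i, b"data")
log.mark_applied(3)
removed = log.compact(up_to=5)
print(removed, sorted(log.entries))
```

Even though the caller asks to compact up to index 5, only the entries whose data has already been persisted are removed.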

| username: 胡杨树旁

Okay, I roughly understand. Thank you.

| username: oceanzhang

I haven’t learned this part yet, so I’ll bookmark it for now and study it later.

| username: TiDBer_小阿飞

Cleanup starts as soon as the threshold of any one of the parameters is reached.