Uses of SST Snapshot

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: SST Snapshot用途

| username: TiDBer_h9um7nOg

【Encountered Issue】
In what scenarios is an SST snapshot mainly applied or generated?

| username: HACK | Original post link

In Raft, if a Follower lags too far behind the Leader, the Leader may send it a snapshot directly. In TiKV, PD also sometimes schedules replicas of a Raft Group onto other machines. Both cases involve handling snapshots.

In the current implementation, a snapshot process is as follows:

  1. The Leader scans all the data of a region and generates a snapshot file.
  2. The Leader sends the snapshot file to the Follower.
  3. The Follower receives the snapshot file, reads it, and writes it into RocksDB in batches.

If a node hosts Followers from multiple Raft Groups that are all applying snapshot files at the same time, the write pressure on RocksDB becomes very high, and the resulting compaction work can easily cause RocksDB-wide write slowdowns or stalls.
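As a rough illustration of this write path, here is a minimal sketch using the `rocksdb` Rust crate; the function name and batch size are made up for illustration and are not TiKV's actual code:

```rust
use rocksdb::{WriteBatch, DB};

// Hypothetical helper: apply snapshot data via RocksDB's normal write
// path, in fixed-size batches (names and batch size are illustrative).
fn apply_snapshot_naively(db: &DB, kvs: &[(Vec<u8>, Vec<u8>)]) -> Result<(), rocksdb::Error> {
    for chunk in kvs.chunks(1024) {
        let mut batch = WriteBatch::default();
        for (k, v) in chunk {
            batch.put(k, v);
        }
        // Each batch goes through the WAL and memtable, so many
        // concurrent snapshots create heavy write and compaction pressure.
        db.write(batch)?;
    }
    Ok(())
}
```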

Fortunately, RocksDB provides an SST file mechanism: we can generate the snapshot directly as an SST file, and the Follower can use RocksDB's ingest interface to load the SST file into RocksDB directly, bypassing the normal write path.
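A minimal sketch of this SST-based path, again using the `rocksdb` Rust crate (the function names and file-path handling are illustrative, not TiKV's actual implementation):

```rust
use rocksdb::{Options, SstFileWriter, DB};

// Sender side: write the snapshot's key-value pairs into an SST file.
// Note that SstFileWriter requires keys to be added in ascending order.
fn write_snapshot_sst(kvs: &[(Vec<u8>, Vec<u8>)], path: &str) -> Result<(), rocksdb::Error> {
    let opts = Options::default();
    let mut writer = SstFileWriter::create(&opts);
    writer.open(path)?;
    for (k, v) in kvs {
        writer.put(k, v)?;
    }
    writer.finish()
}

// Receiver side: ingest the SST file directly into the LSM tree,
// bypassing the memtable and WAL entirely.
fn ingest_snapshot_sst(db: &DB, path: &str) -> Result<(), rocksdb::Error> {
    db.ingest_external_file(vec![path])
}
```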

| username: 啦啦啦啦啦 | Original post link

You can refer to this

| username: Raymond | Original post link

If a TiKV node holds relatively few regions, PD will schedule regions from other TiKV nodes onto it. The scheduling generally works by first sending a snapshot of the region (a copy of the region's state at a point in time) to this TiKV, and then synchronizing the incremental data through the Raft log. This is similar to setting up MySQL master-slave replication: the slave is first built from a full copy of the master's data, and then catches up through the binlog.
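A toy sketch of that "full snapshot, then incremental log" idea (all types and names here are made up for illustration; TiKV's real logic lives in its raftstore code):

```rust
// Illustrative types only; not TiKV's actual data structures.
struct Snapshot {
    applied_index: u64,
    data: Vec<(String, String)>,
}

struct RaftLogEntry {
    index: u64,
    key: String,
    value: String,
}

// A new peer first installs the snapshot (the full copy), then replays
// only the Raft log entries after the snapshot's applied index, much
// like a MySQL slave applying binlog after the initial data copy.
fn bootstrap_peer(snap: Snapshot, log: &[RaftLogEntry]) -> (u64, Vec<(String, String)>) {
    let snap_index = snap.applied_index;
    let mut state = snap.data;
    let mut applied = snap_index;
    for e in log.iter().filter(|e| e.index > snap_index) {
        state.push((e.key.clone(), e.value.clone()));
        applied = e.index;
    }
    (applied, state)
}
```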

| username: system | Original post link

This topic was automatically closed 1 minute after the last reply. No new replies are allowed.