In storage, does the Scheduler convert data into a snapshot first when writing data?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: storage中,Scheduler 在实现数据写时,是先将数据转换成快照么

| username: Lystorm

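While reading the TiKV source code analysis series (TiKV 源码解析系列文章(十一)Storage - 事务控制层 | PingCAP; in English, "TiKV Source Code Analysis Series (11): Storage - Transaction Control Layer"), I came across a piece of logic in the storage part that I do not fully understand.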


After a request reaches storage and acquires the latch, does it directly request a snapshot? This part is not very clear to me.
From reading the scheduler source code, it seems that after receiving the request, it first builds the key, gets the value, and then converts the data into a snapshot structure. Is that right?

This is my first attempt to read the TiKV source code, and I am not very clear about the overall structure. Seeking help for clarification.

| username: TiDBer_jYQINSnf | Original post link

Prewrite involves acquiring locks, right? These locks are data stored in the lock column family of RocksDB, so prewrite needs to read from RocksDB.

This read is done through a snapshot.

After the snapshot is obtained, the subsequent reads of the term and the lock go through it. Then the prewrite data is assembled and written.
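The order described above (latch, then snapshot, then lock check, then write) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `Engine`, `Snapshot`, `Latches`, and `prewrite` are hypothetical names invented for illustration and are not TiKV's real types, and the snapshot is modeled as a plain clone of in-memory maps rather than a RocksDB snapshot.

```rust
use std::collections::{BTreeMap, HashSet};

/// Toy engine with two column families. A stand-in for RocksDB, not TiKV's real engine.
#[derive(Clone, Default)]
struct Engine {
    lock_cf: BTreeMap<Vec<u8>, u64>,        // key -> start_ts of the transaction holding the lock
    default_cf: BTreeMap<Vec<u8>, Vec<u8>>, // key -> value
}

/// An immutable point-in-time view of the engine (modeled here as a clone).
struct Snapshot {
    view: Engine,
}

impl Engine {
    fn snapshot(&self) -> Snapshot {
        Snapshot { view: self.clone() }
    }
}

/// In-memory latches that serialize commands touching the same key.
#[derive(Default)]
struct Latches {
    held: HashSet<Vec<u8>>,
}

impl Latches {
    /// Returns true if the latch was free and is now held by the caller.
    fn acquire(&mut self, key: &[u8]) -> bool {
        self.held.insert(key.to_vec())
    }

    fn release(&mut self, key: &[u8]) {
        self.held.remove(key);
    }
}

/// Hypothetical prewrite: latch -> snapshot -> check lock CF -> write lock and data.
fn prewrite(
    engine: &mut Engine,
    latches: &mut Latches,
    key: &[u8],
    value: &[u8],
    start_ts: u64,
) -> Result<(), String> {
    if !latches.acquire(key) {
        return Err("latch busy".to_string());
    }
    // All reads below go through the snapshot, not the live engine.
    let snapshot = engine.snapshot();
    let result = match snapshot.view.lock_cf.get(key) {
        Some(&ts) if ts != start_ts => Err(format!("key is locked by txn {}", ts)),
        _ => {
            engine.lock_cf.insert(key.to_vec(), start_ts);
            engine.default_cf.insert(key.to_vec(), value.to_vec());
            Ok(())
        }
    };
    latches.release(key);
    result
}
```

In this model, a first `prewrite` on a key succeeds, and a second `prewrite` on the same key with a different `start_ts` fails because the lock is visible in the snapshot it takes.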

This is my personal understanding. If there are any mistakes, experts are welcome to correct me. :blush:

| username: pingyu | Original post link

SCHED_STAGE_COUNTER_VEC.get(tag).snapshot_ok.inc() only updates a metric: it increments the snapshot_ok counter shown under "Scheduler stage total" for the Scheduler. The tag here is the request’s tag (for example, the tag for a prewrite request is “prewrite”); it is not the key of the data.
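To make the role of the tag concrete, here is a minimal sketch of a per-tag stage counter. It is not TiKV's metrics code: `StageCounters`, `StageCounterVec`, `CommandKind`, and the `tag()` method are hypothetical names, and a plain `HashMap` stands in for the real metrics library.

```rust
use std::collections::HashMap;

/// Counters for the stages one kind of command passes through (hypothetical subset).
#[derive(Default, Debug, PartialEq)]
struct StageCounters {
    snapshot_ok: u64,
    snapshot_err: u64,
}

/// The kind of RPC request; the tag is derived from this, never from the data key.
#[derive(Clone, Copy)]
enum CommandKind {
    Prewrite,
    Commit,
}

impl CommandKind {
    fn tag(self) -> &'static str {
        match self {
            CommandKind::Prewrite => "prewrite",
            CommandKind::Commit => "commit",
        }
    }
}

/// One set of stage counters per request tag.
#[derive(Default)]
struct StageCounterVec {
    by_tag: HashMap<&'static str, StageCounters>,
}

impl StageCounterVec {
    /// Returns the counters for a tag, creating them on first use.
    fn get(&mut self, tag: &'static str) -> &mut StageCounters {
        self.by_tag.entry(tag).or_default()
    }
}
```

With this shape, `counters.get(CommandKind::Prewrite.tag()).snapshot_ok += 1` counts one successful snapshot acquisition for prewrite requests, regardless of which keys those requests touch.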

The term in let term = snapshot.ext().get_term() is the Raft term. It is read from the snapshot and stored in task.cmd.ctx, and it is used later to check whether a leader transfer has occurred, for example here (I’m not very familiar with this part, but it seems to be related to in-memory pessimistic locks).
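The idea behind keeping the term can be sketched as follows. In Raft, every leader election starts a new term, so if the term recorded with the snapshot still equals the current term, no leader change happened in between. The types below (`SnapshotExt`, `Context`, `leader_unchanged`) are simplified, hypothetical stand-ins and do not reproduce TiKV's actual signatures.

```rust
/// Hypothetical stand-in for the snapshot extension that exposes the Raft term.
struct SnapshotExt {
    term: u64,
}

impl SnapshotExt {
    fn get_term(&self) -> u64 {
        self.term
    }
}

/// Hypothetical request context that remembers the term seen at snapshot time.
#[derive(Default)]
struct Context {
    term: u64,
}

/// Records the term observed when the snapshot was taken.
fn record_term(ctx: &mut Context, ext: &SnapshotExt) {
    ctx.term = ext.get_term();
}

/// A leader change always bumps the term, so an equal term means the same leader.
fn leader_unchanged(ctx: &Context, current_term: u64) -> bool {
    ctx.term == current_term
}
```

A later step can call `leader_unchanged(&ctx, current_term)` and fall back to a slower, safer path when it returns false.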

| username: Lystorm | Original post link

I think your understanding is correct. The article only says the lock is obtained through a snapshot. But if there is no snapshot and only a regular Raft message, how is the lock obtained? I don’t see this logic explained anywhere.

| username: Lystorm | Original post link

Does the request’s tag refer to the Raft details carried in the request (such as leader term, region id, cluster id, etc.)?

| username: TiDBer_jYQINSnf | Original post link

RaftMessage belongs to the later logic, while the snapshot sits a layer above it. Raft only exists to reach consensus among multiple nodes and to ensure no data is lost. What lock would a RaftMessage acquire?

| username: Lystorm | Original post link

Does that mean the upper layer interacts through snapshots? My doubt comes from what I saw in the Raft part: when the cluster is first established there should be no snapshots, because a snapshot is only triggered once the data reaches a certain threshold.

| username: TiDBer_jYQINSnf | Original post link

There are two types of snapshots in TiKV:

  1. When reading data, a snapshot is used to read the data.
  2. When adding a new node to a Raft group, a snapshot of the region is sent to the new node so that it can catch up quickly without occupying the Raft message channel.

Are you possibly referring to the second type?

| username: Lystorm | Original post link

My point of confusion is with the first one:
Are all data read operations in TiKV based on snapshots?

| username: TiDBer_jYQINSnf | Original post link

From the code I reviewed, yes, it first obtains a snapshot before proceeding with the subsequent processes. I’m not sure if all requests follow this pattern.

| username: Lystorm | Original post link

Yes, I also think they all obtain a snapshot first and then proceed with the subsequent steps. What I am unsure about is whether any requests skip the snapshot, so I cannot be certain that all requests work this way.

| username: pingyu | Original post link

No, the tag is used to identify the current type of RPC request and has nothing to do with Raft.

For prewrite requests, the tag is “prewrite.”

See here.
