How TiFlash Synchronizes Data via Raft and Handles Log Data Before Percolator Transactions Are Committed

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiFalsh通过raft同步数据,如何处理perlator事务comited前的日志数据

| username: TiDBer_qYO6BiP6

I am curious about the following. Percolator commits in two phases, and each row of data is split across three columns, which is what makes the Percolator protocol work. However, Raft replicates the log entries written during the transaction, so intermediate columns such as lock are also synchronized to TiFlash. The row data that TiFlash persists should not include this information, right? So how exactly is the transaction reconstructed on the TiFlash side, and how are the useless dirty log entries handled (do they take part in the row-to-column conversion)?

TiFlash acts as the KvEngine for TiKV, and data is written as columns encapsulated in Blocks. Could you point out the specific logic (the row-to-column conversion), or the code location, that handles a transaction's data on its way from TiKV to TiFlash before it is encapsulated into Blocks?

Thanks.

| username: Billmay表妹 | Original post link

You can take a look at this article. I hope it helps.

TiFlash performs row-to-column conversion during Raft synchronization.
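As a rough illustration of what "row-to-column conversion" means, here is a minimal Python sketch. It is not TiFlash code: the function `rows_to_columns` and the sample schema are hypothetical, and the real conversion also has to decode TiKV's row encoding, which is skipped here.

```python
def rows_to_columns(rows, column_names):
    """Transpose row tuples into a column-oriented, Block-like dict.

    Hypothetical helper for illustration only; not a TiFlash API.
    """
    block = {name: [] for name in column_names}
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError("row width does not match the schema")
        # Append each field to the list that holds its column.
        for name, value in zip(column_names, row):
            block[name].append(value)
    return block


# Two decoded rows, as they might look after arriving in row format.
rows = [(1, "alice", 30), (2, "bob", 25)]
block = rows_to_columns(rows, ["id", "name", "age"])
print(block)
```

Each column ends up as one contiguous list, which is the layout a columnar engine scans efficiently.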

There are two types of data synchronized via Raft:

a. Data snapshots (Snapshot), which contain the entire Region's data. These mainly come from newly added TiFlash replicas, or from data imported as SST files by tools such as Lightning.

For data coming from snapshots (such as a 96 MB SST Snapshot), TiFlash converts rows to columns in batches to generate DTFile files, and then runs a dedicated Ingest Snapshot process.

b. Incremental KV data, mainly from write operations.

For incremental KV data, TiFlash reorganizes it in memory into Block-based in-memory data units, and then runs the standard write process.
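To connect this with the original question about lock data, here is a toy Python model of the incremental path. It rests on an assumption that this thread does not confirm: that only rows with a commit record in the write column are handed to the row-to-column step, while prewritten (locked) data stays aside until it is committed. The class `ToyRegion` and its methods are hypothetical and do not correspond to TiFlash or TiKV classes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Hypothetical in-memory holder for Percolator's three columns."""
    default: dict = field(default_factory=dict)  # (key, start_ts) -> value
    lock: dict = field(default_factory=dict)     # key -> start_ts
    write: dict = field(default_factory=dict)    # (key, commit_ts) -> start_ts

    def prewrite(self, key, start_ts, value):
        # Phase 1: place a lock and stage the value.
        self.lock[key] = start_ts
        self.default[(key, start_ts)] = value

    def commit(self, key, start_ts, commit_ts):
        # Phase 2: drop the lock and record the commit.
        if self.lock.get(key) != start_ts:
            raise KeyError(f"no matching lock for {key!r}")
        del self.lock[key]
        self.write[(key, commit_ts)] = start_ts

    def drain_committed(self):
        """Return committed rows as (key, commit_ts, value).

        Uncommitted data (still locked) is left in place.
        """
        rows = []
        for (key, commit_ts), start_ts in sorted(self.write.items()):
            value = self.default.pop((key, start_ts))
            rows.append((key, commit_ts, value))
        self.write.clear()
        return rows


region = ToyRegion()
region.prewrite("k1", 10, "v1")
region.prewrite("k2", 11, "v2")
region.commit("k1", 10, 12)   # only k1 is committed

rows = region.drain_committed()
print(rows)          # committed rows, ready for row-to-column conversion
print(region.lock)   # k2 is still locked and was not converted
```

In this sketch the lock entry for `k2` never reaches the columnar side. Whether TiFlash works exactly this way should be checked against the source code analysis materials listed below.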

| username: Billmay表妹 | Original post link

Here is a comprehensive collection of past TiFlash source code analysis materials, available for download.

Issue 1: Design Ideas of TiFlash Storage Engine

Author: Huang Junshen

Summary: This issue introduces the overall form of TiDB HTAP and provides a detailed analysis of the design ideas for optimizing the storage layer DeltaTree engine and its submodules.

Meeting Materials: TiFlash Storage Layer Overview.pdf (877.2 KB)

Video Replay: Design Ideas of TiFlash Storage Engine_Bilibili

Full Review: Column - TiFlash Source Code Reading (1) TiFlash Storage Layer Overview | TiDB Community

Issue 2: Overview of TiFlash Computing Layer

Author: Xu Fei

Summary: This issue provides an overview of the design principles and code implementation of the TiFlash computing layer.

Meeting Materials: TiFlash Computing Layer Overview - Xu Fei.pdf (1.1 MB)

Video Replay: Source Code Analysis - TiFlash Computing Layer Overview_Bilibili

Full Review: Column - TiFlash Source Code Reading (2) Computing Layer Overview | TiDB Community

Issue 3: Analysis of TiFlash DeltaTree Engine Design and Implementation Part 1

Author: Shi Wenxuan

Summary: This issue provides an in-depth understanding of the principles and workflows related to the write path of the TiFlash storage layer DeltaTree engine.

Meeting Materials: TiFlash DeltaTree Storage Engine (Part 1).pdf (2.2 MB)

Video Replay: Analysis of TiFlash DeltaTree Engine Design and Implementation_Bilibili

Full Review: Column - TiFlash Source Code Reading (3) DeltaTree Storage Engine Design and Implementation Analysis - Part 1 | TiDB Community

Issue 4: Analysis of TiFlash DeltaTree Engine Design and Implementation Part 2

Author: Shi Wenxuan

Summary: This issue provides an in-depth understanding of the read and write workflows and code implementation of the TiFlash storage layer DeltaTree engine.

Meeting Materials: TiFlash DeltaTree Storage Engine (Part 2).pdf (1.2 MB)

Video Replay: Source Code Analysis | TiFlash Storage Layer DeltaTree Engine (Read Path)_Bilibili

Full Review: Column - TiFlash Source Code Reading (5) DeltaTree Storage Engine Design and Implementation Analysis - Part 2 | TiDB Community

Issue 5: Analysis of TiFlash DDL Module Design and Implementation

Author: Hong Yunyan

Summary: This issue provides an understanding of the design ideas and code implementation of the TiFlash DDL module.

Meeting Materials: TiFlash Source Code Analysis - DDL Module(2).pdf (1.5 MB)

Video Replay: Source Code Analysis | TiFlash DDL Module Design and Implementation Analysis_Bilibili

Full Review: Column - TiFlash Source Code Analysis (4) | TiFlash DDL Module Design and Implementation Analysis | TiDB Community

Issue 6: Design and Implementation of Common Operators in TiFlash

Author: Qi Zhi

Summary: This issue provides an understanding of the various stages of TiFlash operators, explaining the design logic of the operator code, enabling further independent code reading or simple issue handling.

Meeting Materials: Design and Implementation of Common Operators in TiFlash.pdf (2.9 MB)

Video Replay: Source Code Analysis | Design and Implementation of Common Operators in TiFlash_Bilibili

Issue 7: Design and Implementation of TiFlash DeltaTree Index

Author: Li Dezhong

Summary: This issue provides an understanding of the role and implementation principles of the core data structure DeltaTree Index in the TiFlash storage layer.

Meeting Materials: Design and Implementation Analysis of TiFlash DeltaTree Index.pdf (1.2 MB)

Video Replay: TiFlash DeltaTree Index_Bilibili

Full Review: Column - Design and Implementation Analysis of TiFlash DeltaTree Index | TiDB Community

Issue 8: Introduction to TiFlash Proxy Module

Author: Luo Rongzhen

Summary: This issue helps understand the principles of the TiFlash Proxy module, how it helps TiFlash obtain data, how it interacts with TiFlash, and the adjustments and optimizations made for TiFlash’s write mode compared to TiKV.

Meeting Materials: TiFlash Source Code Analysis - Proxy Module.pdf (1.6 MB)

Video Replay: Introduction to TiFlash Proxy Module_Bilibili

Full Review: Column - Introduction to TiFlash Proxy Module | TiDB Community

Issue 9: Design and Implementation of TiFlash Expressions

Author: Huang Haisheng

Summary: This issue provides an understanding of the design and source code implementation of TiFlash expressions, aiding in future contributions to TiFlash.

Meeting Materials: Design and Implementation of TiFlash Expressions.pdf (1.8 MB)

Video Replay: Design and Implementation of TiFlash Expressions_Bilibili

Full Review: Column - Design and Implementation of TiFlash Expressions | TiDB Community

| username: 裤衩儿飞上天 | Original post link

This is too comprehensive.

| username: 我是咖啡哥 | Original post link

Saved, awesome! :100:

| username: 我是咖啡哥 | Original post link

Who says the cousin doesn’t understand technology? When it comes to knowing the community, if the cousin claims to be second, who dares to claim to be first? :smile:

| username: 特雷西-迈克-格雷迪 | Original post link

Awesome, I’ll bookmark it and read it slowly.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.