Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: PCTA学习笔记

Course Name: Core Principles and Architecture of TiDB Database (101)
Study Duration:
500 minutes
Course Gains:
Architecture of various TiDB components
Course Content:
What kind of database do we need?
-
Scalability
-
Strong consistency and high availability: RPO (Recovery Point Objective) data loss amount, RTO (Recovery Time Objective) system recovery time. RPO=0, RTO sufficiently small
-
Standard SQL and ACID transactions
-
Cloud-native
-
HTAP (Hybrid Transactional/Analytical Processing)
-
Compatibility with mainstream ecosystems and protocols
Common foundational factors in the data technology stack:
-
Data model
-
Data storage and retrieval
-
Data format
-
Storage engine
-
Replication protocol
-
Distributed transaction model
-
Data architecture
-
Optimizer algorithms
-
Execution engine
Separation of computation and storage
Hardware, especially network development, has driven the separation of computation and storage.
What kind of storage engine do we need?
-
Finer-grained elastic scalability
-
High concurrent read and write
-
No data loss or errors
-
Multiple replicas to ensure consistency and high availability
-
Support for distributed transactions
Choosing data structures
LSM-tree: Using space to trade for write latency, replacing random writes with sequential writes
Writing is the most expensive cost
LSM-tree supports batch writing and lock-free snapshots
Data replication
Raft
How to achieve scalability
Pre-sharding (static sharding) or auto-sharding (dynamic sharding): use auto-sharding
Sharding algorithms: hash, range, list: use range
Region is a range shard
PD balances the scheduling of regions
Distributed transactions: decentralized two-phase commit
TiDB transaction isolation: SI (Snapshot Isolation), RR supports RC
Default optimistic locking
How to build a distributed SQL engine
Small table cache
Data volume not large, below 64MB
Read-only or infrequently modified
Table accessed very frequently
Lease
tidb_table_cache_lease = 5 seconds
Cannot write before lease expires, writing will block
Enabling small table cache prevents DDL operations, mutual exclusion
First write to WAL, then to memtable.
When write_buffer_size is reached, memtable will convert to immutable memtable
Immutable memtable prevents write blocking when memtable directly converts to SST
If there is one immutable memtable, it will flush to disk. If there are five without persisting to SST, it will cause write blocking (write stall) flow control
For data less than 255 bytes, directly write to the write column
Issues or Extended Thoughts During Learning:
- Issue 1: Difference between clustered and non-clustered indexes
Using MySQL as an example, a clustered index is an index sorted by the primary key of the table, and a table can only have one clustered index. The data storage method of a clustered index in the table is to store the entire table’s data in a B+ tree and sort it by the primary key.
A non-clustered index is an index sorted by non-primary key fields, and a table can have multiple non-clustered indexes. The data storage method of a non-clustered index in the table is to store the value of the index field and the address of the data row in a B+ tree.
Therefore, if querying data by non-primary key fields, you can locate the disk page of the data row that meets the conditions through the non-clustered index and then directly retrieve the data of that row, thus being relatively efficient.