Course Notes: Getting Started with TiDB (Part 1)

translator_bot · June 23, 2024, 12:10pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 课程笔记 TiDB 快速起步（上）

| username: 云散月明

Course Certificate

Course Link

TiDB Quick Start

Course Outline

A Brief History of Databases, Big Data, and TiDB
- 01: History and Trends in Database and Big Data Development
- 02: Development of Distributed Relational Databases
- 03: Evolution of TiDB Products and Open Source Community
Overview of TiDB
- 04: What Kind of Database Do We Really Need?
- 05: How to Build a Distributed Storage System
- 06: How to Build a Distributed SQL Engine
Next-Generation HTAP Database Selection
- 07: HTAP Databases Based on Distributed Architecture
- 08: Key Technological Innovations in TiDB
- 09: Typical Application Scenarios and User Cases of TiDB
First Experience with TiDB
- 10: First Experience with TiDB

Course Notes (Part 1)

Understanding database development trends from multiple perspectives such as time, data volume, and architectural evolution
Understanding that distributed relational databases are future-oriented databases
Intrinsic drivers of database technology development: business growth (data volume), scenario innovation (data model and interaction efficiency), hardware and cloud computing development
Database architecture: single node, shared state, distributed
RDBMS → NoSQL (Not only SQL) → NewSQL → HTAP
Segmentation of data technology and integration of data services
Trade-Off (choices and balances)
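1965, Gordon Moore, Moore's Law: his 1965 paper projected that the number of transistors on an integrated circuit would double about every year; he revised this to every two years in 1975, and the often-quoted 18-month figure is a later variant.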
By 2006, Google had published its GFS (2003), MapReduce (2004), and Bigtable (2006) papers
Divide and conquer in distributed systems
Main challenges of distributed technology
1. How to maximize divide and conquer
2. How to achieve global consistency
3. How to handle fault tolerance and partial failures
4. How to deal with unreliable networks and network partitions
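
As a rough illustration of divide and conquer, the sketch below splits a key space into contiguous ranges and routes each key to the shard that owns its range. The split keys and function names are hypothetical, chosen only for illustration; they are not taken from the course or from TiDB's actual implementation.

```python
import bisect

# Hypothetical split points (an assumption for this sketch):
#   shard 0: key < "g"
#   shard 1: "g" <= key < "n"
#   shard 2: "n" <= key < "t"
#   shard 3: key >= "t"
SPLIT_KEYS = ["g", "n", "t"]


def shard_for(key: str) -> int:
    """Return the index of the range shard that owns `key`."""
    # bisect_right counts how many split points are <= key,
    # which is exactly the index of the owning range.
    return bisect.bisect_right(SPLIT_KEYS, key)


def partition(keys):
    """Group keys by owning shard so each shard can be processed independently."""
    shards = {i: [] for i in range(len(SPLIT_KEYS) + 1)}
    for key in keys:
        shards[shard_for(key)].append(key)
    return shards
```

Calling `partition(["apple", "mango", "zebra"])` places each key in a different shard, so the work on each shard could in principle run on a separate node. Rebalancing, replication, and failure handling are deliberately left out.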
CAP Theorem
1. C Consistency
2. A Availability
3. P Partition Tolerance
Relational model and transactions
1. A Atomicity
2. C Consistency
3. I Isolation
4. D Durability
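
To make atomicity concrete, here is a minimal sketch of a transfer between two accounts. It uses SQLite through Python's standard `sqlite3` module purely as a stand-in for a relational database; the `accounts` table, the names, and the `transfer` helper are assumptions made for this example and do not come from the course.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Raising here undoes the debit above (atomicity).
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

A transfer that would overdraw the source account leaves both balances untouched, because the partial debit is rolled back together with the rest of the transaction. A distributed database has to provide the same guarantee when the two rows live on different nodes, which is where a distributed transaction model comes in.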
NewSQL (natively distributed relational database) = distributed system + SQL + transactions
2012 to 2013, Google: Spanner paper (2012), F1 paper (2013)
2014, Raft paper: a consensus protocol designed to be understandable, which made industrial-grade implementations of distributed consensus practical
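
The core idea behind Raft-style replication is that a write counts as committed once a majority of replicas have stored it. The sketch below shows only that majority arithmetic; the function names are hypothetical, and it is a simplification that omits leader election, terms, and Raft's rule that a leader only commits entries from its own term by counting replicas.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many replicas can be lost while a majority is still reachable."""
    return cluster_size - majority(cluster_size)


def commit_index(match_indexes):
    """Highest log index stored on a majority of replicas.

    `match_indexes` holds, for every replica including the leader,
    the highest log index known to be stored on that replica.
    """
    ranked = sorted(match_indexes, reverse=True)
    # The entry at position majority-1 is stored on at least `majority` replicas.
    return ranked[majority(len(match_indexes)) - 1]
```

With three replicas one failure can be tolerated, and with five replicas two. Adding a fourth replica to a three-node group does not raise the number of tolerated failures, which is one reason odd replica counts are the usual choice.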
TiDB, open-source, natively distributed relational database, HTAP
Open source: one of the best paths to success for foundational software (which tends to be general-purpose and standardized)
Open source means open code, an open attitude, and open ecosystem governance
Designing a distributed relational database
1. Scalability (elastic, write-oriented)
2. Strong consistency, high availability (RPO=0, RTO small enough)
3. Standard SQL supporting ACID transactions
4. Cloud-native
5. HTAP (OLTP and OLAP combined over massive data sets, with hybrid row and column storage)
6. Compatibility with mainstream ecosystems and protocols
Common foundational factors in the data technology stack
1. Data model
2. Data storage and retrieval structure
3. Data format
4. Storage engine
5. Replication protocol
6. Distributed transaction model
7. Data architecture
8. Optimizer algorithm
9. Execution engine
10. Computing engine
Development of hardware, especially networks, has driven the separation of computing and storage architecture
Highly layered architecture of TiDB

| username: ShawnYan

| username: tidb狂热爱好者

Is there an exam for this as well?