Course Notes: Getting Started with TiDB (Part 1)

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 课程笔记 TiDB 快速起步(上)

| username: 云散月明

Course Link: TiDB Quick Start

Course Outline

  • A Brief History of Databases, Big Data, and TiDB

    • 01: History and Trends in Database and Big Data Development

    • 02: Development of Distributed Relational Databases

    • 03: Evolution of TiDB Products and Open Source Community

  • Overview of TiDB

    • 04: What Kind of Database Do We Really Need?

    • 05: How to Build a Distributed Storage System

    • 06: How to Build a Distributed SQL Engine

  • Next-Generation HTAP Database Selection

    • 07: HTAP Databases Based on Distributed Architecture

    • 08: Key Technological Innovations in TiDB

    • 09: Typical Application Scenarios and User Cases of TiDB

  • First Experience with TiDB

    • 10: First Experience with TiDB

Course Notes (Part 1)

  1. Understanding database development trends from multiple perspectives such as time, data volume, and architectural evolution

  2. Understanding that distributed relational databases are future-oriented databases

  3. Intrinsic drivers of database technology development: business growth (data volume), scenario innovation (data models and interaction efficiency), and advances in hardware and cloud computing

  4. Database architecture: single node, shared state, distributed

  5. RDBMS → NoSQL (Not only SQL) → NewSQL → HTAP

  6. Segmentation of data technology and integration of data services

  7. Trade-offs (making choices and striking balances)

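  8. 1965, Gordon Moore: the number of transistors on an integrated circuit doubles at a regular interval (Moore's original estimate was every year, revised in 1975 to every two years; the often-quoted 18-month figure is a popular paraphrase).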

  9. 2003 to 2006, Google's GFS (2003), MapReduce (2004), and Bigtable (2006) papers

  10. Divide and conquer in distributed systems

  11. Main challenges of distributed technology

    1. How to maximize divide and conquer

    2. How to achieve global consistency

    3. How to handle fault tolerance and partial failures

    4. How to deal with unreliable networks and network partitions

  12. CAP Theorem

    1. C Consistency

    2. A Availability

    3. P Partition Tolerance

  13. Relational model and transactions

    1. A Atomicity

    2. C Consistency

    3. I Isolation

    4. D Durability

  14. NewSQL (natively distributed relational database) = distributed system + SQL + transactions

  15. 2012 to 2013, Google: Spanner paper (2012), F1 paper (2013)

  16. 2014, Raft paper: a consensus protocol designed to be understandable and practical for industrial-grade implementations

  17. TiDB, open-source, natively distributed relational database, HTAP

  18. Open source: one of the best paths to success for foundational software (which tends to be general-purpose and standardized)

  19. Open source: open source code, open attitude, open source ecosystem governance

  20. Designing a distributed relational database

    1. Scalability (elastic, write-oriented)

    2. Strong consistency and high availability (RPO = 0, meaning no data loss; RTO sufficiently small, meaning fast recovery)

    3. Standard SQL supporting ACID transactions

    4. Cloud-native

    5. HTAP (OLTP and OLAP combined at massive data scale, with hybrid row and column storage)

    6. Compatibility with mainstream ecosystems and protocols

  21. Common foundational factors in the data technology stack

    1. Data model

    2. Data storage and retrieval structure

    3. Data format

    4. Storage engine

    5. Replication protocol

    6. Distributed transaction model

    7. Data architecture

    8. Optimizer algorithm

    9. Execution engine

    10. Computing engine

  22. Advances in hardware, especially networking, have driven architectures that separate computing from storage

  23. Highly layered architecture of TiDB
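
To make the "Relational model and transactions" item above a little more concrete, here is a minimal sketch of atomicity (the A in ACID). It is not from the course and does not use TiDB: it assumes Python's built-in sqlite3 module as a stand-in for any SQL database with ACID transactions, and the accounts table and the transfer and demo functions are hypothetical names chosen only for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one atomic unit.

    Hypothetical helper for illustration; returns True on commit,
    False if the transaction was rolled back.
    """
    try:
        # `with conn` commits if the block succeeds and rolls back
        # if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Abort: both updates above are undone together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def demo():
    # In-memory database, so the sketch leaves nothing behind.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()

    ok = transfer(conn, 1, 2, 30)       # fits within the balance
    failed = transfer(conn, 1, 2, 500)  # would overdraw, so it is rolled back
    balances = dict(conn.execute("SELECT id, balance FROM accounts"))
    conn.close()
    return ok, failed, balances


if __name__ == "__main__":
    print(demo())
```

The second transfer fails after both UPDATE statements have already run, yet neither change is visible afterwards. That all-or-nothing behavior is what a distributed relational database has to preserve even when the two rows live on different nodes.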

| username: ShawnYan

👍

| username: tidb狂热爱好者

Is there an exam for this as well?