When TiKV reads data, does it first read from the block cache or the memtable?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv读取数据时候,是先读block cache,还是先读取memtable?

| username: yytest

When TiKV reads data, does it check the block cache or the memtable first? How is the data held in the block cache flushed? Is there an official explanation of this?

| username: FutureDB | Original post link

In TiKV, a read first checks the memtable and then the block cache. This is because TiKV uses RocksDB as its storage engine, and a point lookup in RocksDB proceeds as follows:

  1. Memtable: When a read request arrives, RocksDB first checks the active memtable, an in-memory data structure that holds the most recently written data. If the key is found there, the value is returned directly.

  2. Immutable memtables: If the key is not in the active memtable, RocksDB checks the immutable memtables, which are full memtables waiting to be flushed to disk, from newest to oldest.

  3. SSTables, through the block cache: If the key is in none of the memtables, RocksDB searches the SST files, starting at L0 (newest file first) and moving down the levels until the key is found or all levels have been checked. Each data block it needs is looked up in the block cache first (if caching is enabled); only on a cache miss is the block read from disk, after which it is placed in the cache.

Therefore, for data reading in TiKV, the priority is: Memtable → Immutable Memtables → Block Cache → SST files on disk. The memtable has to come first because it holds the newest version of a key, while the block cache only holds blocks read from SST files and may therefore contain an older version.
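
As a rough illustration of the RocksDB lookup order (memtable, then immutable memtables, then SST files read through the block cache), here is a toy Python model. It is only a sketch under simplifying assumptions: the class names `BlockCache`, `SSTable` and `ToyLSM` are made up for this example, every block of an SST file is scanned instead of using an index or bloom filter, and none of this is the actual TiKV or RocksDB code.

```python
from collections import OrderedDict


class SSTable:
    """Hypothetical stand-in for an SST file: a list of data blocks."""

    def __init__(self, sst_id, blocks):
        self.sst_id = sst_id
        self.blocks = blocks  # list of dicts, each dict is one "data block"

    def read_block_from_disk(self, block_no):
        # Pretend this is a disk read.
        return dict(self.blocks[block_no])


class BlockCache:
    """Tiny LRU cache standing in for the RocksDB block cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # (sst_id, block_no) -> block
        self.hits = 0
        self.misses = 0

    def get_block(self, sst, block_no):
        key = (sst.sst_id, block_no)
        if key in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(key)  # mark as most recently used
            return self.blocks[key]
        self.misses += 1
        block = sst.read_block_from_disk(block_no)
        self.blocks[key] = block
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return block


class ToyLSM:
    def __init__(self, cache):
        self.memtable = {}
        self.immutable_memtables = []  # newest first
        self.sstables = []             # newest first
        self.cache = cache

    def put(self, key, value):
        self.memtable[key] = value

    def get(self, key):
        # 1. The active memtable holds the newest writes.
        if key in self.memtable:
            return self.memtable[key], "memtable"
        # 2. Immutable memtables that are waiting to be flushed.
        for imm in self.immutable_memtables:
            if key in imm:
                return imm[key], "immutable memtable"
        # 3. SST files, with every block fetched through the block cache.
        for sst in self.sstables:
            for block_no in range(len(sst.blocks)):
                block = self.cache.get_block(sst, block_no)
                if key in block:
                    return block[key], "sst"
        return None, "not found"
```

In this model, a key that was just written is answered from the memtable without the block cache being consulted at all, and the cache only comes into play once the lookup reaches the SST files.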

| username: yytest | Original post link

Thank you for your patient explanation.

| username: TiDBer_QYr0vohO | Original post link

First memtable, then block cache.

| username: Kongdom | Original post link

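You can refer to this column: 专栏 - PCTP考试学习笔记之一:深入TIDB体系架构(下) | TiDB 社区 (Column: PCTP Exam Study Notes, Part 1: A Deep Dive into the TiDB Architecture, Part 2 | TiDB Community)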

| username: 濱崎悟空 | Original post link

Memtable

| username: zhaokede | Original post link

When reading data, TiKV checks the MemTable first and then the Block Cache, so that the most recent writes and frequently read data can be served from memory. The Block Cache only holds copies of blocks that already exist in SST files, so its entries are simply evicted according to the cache replacement policy (LRU by default) and are never flushed to disk. It is the MemTable that gets flushed: once it is full, it becomes immutable and is written out as a new SST file in the background. Together, these mechanisms let TiKV handle a large number of read and write requests efficiently.
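
To make the memtable side of this concrete, here is a minimal Python sketch of a write buffer that is flushed to a sorted, SST-like run once it is full. The class `ToyWritePath` and its `memtable_limit` parameter are hypothetical, the limit counts entries rather than bytes, and the flush happens inline, whereas RocksDB turns the full memtable into an immutable one and flushes it from a background thread.

```python
class ToyWritePath:
    """Hypothetical model of a memtable that is flushed when full."""

    def __init__(self, memtable_limit):
        self.memtable_limit = memtable_limit  # max entries, not bytes (assumption)
        self.memtable = {}
        self.flushed_ssts = []  # newest first; each is a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as a sorted run, then start a fresh memtable.
        sst = sorted(self.memtable.items())
        self.flushed_ssts.insert(0, sst)
        self.memtable = {}
```

A block cache, by contrast, only holds copies of blocks that are already in SST files, so evicting an entry just drops it and needs no write.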

| username: ziptoam | Original post link

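Both the Block Cache and the MemTable are managed automatically by RocksDB. The MemTable is the in-memory write buffer for recent writes, not something set manually for hot small tables, so it is checked first, followed by the Block Cache.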