The storage file density of the TiDB database is too low and the file size is too large

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB 数据库的存储文件密度太低和体积过大

| username: TiDBer_BVAsMOX5

【TiDB Usage Environment】Test Environment

【TiDB Version】v6.1.0

【Encountered Problem】
In the TiDB database, importing 1 million records increased file system usage by 20 GB, a large discrepancy between the data volume and the resulting file size. For the same data, the MySQL database files take up only 1.5 GB.

【Reproduction Path】Operations performed that led to the problem

【Problem Phenomenon and Impact】
In the TiDB database, importing 1 million records increased file system usage by 20 GB, a large discrepancy between the data volume and the resulting file size; for the same data, the MySQL database files take up only 1.5 GB. The community edition of TiDB stores its data as files on the file system, and the storage density seems too low, causing the files to grow too quickly.

【Attachments】
TiDB single-node pseudo-cluster, data file system usage size

TiDB single-node pseudo-cluster, actual inserted data rows 1 million

Table structure: very few columns, and each row is small

TiDB single-node pseudo-cluster, information as follows

| username: xfworld | Original post link

  1. MySQL is single-replica, while TiDB is multi-replica. How many replicas did you configure for your table? (A quick way to check is sketched below.)
  2. The data is compressed at a configurable ratio; you can tune the compression settings according to the resources you have available.
  3. The applicable scenarios are different, so this comparison is not very meaningful. If it’s a single-machine environment, you can just use MySQL.
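
A minimal sketch for checking both of these from a SQL client; the config item names below assume the default TiDB/TiKV naming in v6.x:

```sql
-- Number of Raft replicas maintained by PD (default is 3)
SHOW CONFIG WHERE type = 'pd' AND name = 'replication.max-replicas';

-- Per-level compression algorithms of TiKV's default column family
SHOW CONFIG WHERE type = 'tikv' AND name LIKE '%defaultcf.compression-per-level%';
```
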
| username: ddhe9527 | Original post link

First, you need to consider multiple replicas: with the default of 3 replicas, the space requirement triples. Then you need to consider MVCC (Multi-Version Concurrency Control). Unlike MySQL, which implements it with undo logs, TiDB stores different versions of the same row as different KV pairs in RocksDB; that space is only reclaimed after GC (Garbage Collection) removes the old snapshot data. Finally, you also need to consider the space amplification of RocksDB’s 7-level LSM tree structure, which is approximately 1.12x.
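
To see how long old MVCC versions are retained before GC can reclaim them, a quick sketch of the relevant checks (variable and table names as they exist in v6.1):

```sql
-- Retention window for historical MVCC versions (default 10m0s)
SHOW VARIABLES LIKE 'tidb_gc_life_time';

-- GC bookkeeping recorded by TiDB, including the current safe point
SELECT VARIABLE_NAME, VARIABLE_VALUE
FROM mysql.tidb
WHERE VARIABLE_NAME LIKE 'tikv_gc%';
```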

| username: TiDBer_BVAsMOX5 | Original post link

Thank you for your answer. Is there any official documentation on how to optimize the compression ratio of TiDB database files and reduce storage?

Because with only 1 million rows, the storage file system takes up 20 GB, which is really unusable in a production environment.

I followed the official deployment documentation. Is there anything I might have missed?

| username: TiDBer_BVAsMOX5 | Original post link

According to the official documentation, the configuration is 1 PD + 3 TiKV + 1 TiDB.
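
As a sanity check, the topology that is actually running can be confirmed from a SQL client; a minimal sketch (the column set may differ slightly between versions):

```sql
-- One row per running component instance (pd, tikv, tidb, ...)
SELECT TYPE, INSTANCE, VERSION
FROM information_schema.CLUSTER_INFO;
```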

| username: xfworld | Original post link

That topology is only for verification. If you want to do a POC, you should follow the configuration requirements in the official documentation. The other points you mentioned are also described in detail in the official documentation, but there is a lot of it, so going through it will take quite a bit of time and effort.

| username: TiDBer_BVAsMOX5 | Original post link

Thank you for the helpful reply. One question: in actual production use, with TiDB storing data in three replicas, is it normal for 1 million rows of test data to occupy 20 GB on the TiDB file system?

Is this a characteristic of the TiDB community version itself, or is there a configuration error on my part?

Simulating a production environment cluster on a single machine:

Note: My test environment uses a virtual machine with 8 vCPUs, 36GB of memory, and a 320GB SSD.

| username: xfworld | Original post link

  1. MySQL is built on a B+ tree, while TiDB is built on an LSM tree, which leads to very different storage behavior.
  2. In the pursuit of high write throughput, write amplification is inevitable. This is a known trade-off, not an unknown problem.
  3. If sharding can meet your requirements and you don’t need elastic scaling, you might not need to consider TiDB.
  4. It is recommended to conduct a comprehensive POC before deciding if it meets your needs.
| username: dongb0 | Original post link

I also set up a single machine with 3 TiKV replicas + 1 TiFlash and otherwise default settings. After inserting several hundred thousand rows, the total size of all regions should be around 1-2 GB, and the data files grew by roughly 3-4 GB. However, merely starting the cluster had already increased the folder’s total size by more than 50 GB.

Folder size before deployment:

Deploying the cluster:

After starting the cluster:

Inserting test data according to https://docs.pingcap.com/zh/tidb/dev/sql-statement-show-table-regions:

You can see there are about 4 regions, each approximately 100 MB. With 3 replicas, the total should be around 1.2 GB, yet the file system usage has now increased by 3 GB.
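
For anyone who wants to reproduce this kind of check, a sketch of the queries involved; the schema and table names `test` / `t1` are placeholders:

```sql
-- Each Region of the table is listed once (not once per replica)
SHOW TABLE test.t1 REGIONS;

-- Aggregate Region count and approximate size (in MB) for the same table
SELECT COUNT(*)              AS region_count,
       SUM(APPROXIMATE_SIZE) AS approx_size_mb
FROM information_schema.TIKV_REGION_STATUS
WHERE DB_NAME = 'test' AND TABLE_NAME = 't1';
```

Multiplying the aggregate by the replica count (3 by default) gives a rough sense of the physical footprint before compression.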

| username: ddhe9527 | Original post link

  1. You can wait for GC to trigger and compaction to run before looking at the file size again.
  2. You can use the following script to estimate the size of a table in TiDB (see also the sketch after this list):
    TiDB Cluster Management FAQs | PingCAP Docs
  3. You can control the compression algorithm used by each level of RocksDB by adjusting the following parameters, but it is generally not recommended to change them:
    TiKV Configuration File | PingCAP Docs
  4. You can see the space amplification ratio on the Size amplification panel of the PD → Statistics - Balance page in Grafana.
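
A rough, statistics-based way to compare a table’s logical size with its on-disk footprint, assuming the table’s statistics are reasonably fresh (`test.t1` is again a placeholder):

```sql
-- Logical size estimated from statistics (average row size x row count);
-- this is not the physical on-disk size after replication and compression
SELECT TABLE_SCHEMA,
       TABLE_NAME,
       TABLE_ROWS,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 2) AS logical_size_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 't1';
```

Comparing this number with the Region-level approximate size and the actual disk usage (multiplied by the replica count) gives a sense of how much of the gap comes from replication, retained MVCC versions, and LSM-tree space amplification.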