Questions about TiDB data compression?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于TiDB数据压缩的疑问?

| username: 春风十里

[TiDB Usage Environment] Testing/PoC
[TiDB Version] v7.5.0
[Reproduction Path] Check the size of a single table in the TiDB database
[Encountered Problem: Import a single table through sysbench to verify the actual storage space usage of TiDB. I read in the documentation that TiDB has a compression ratio, but in testing a single table, I found that compared to MySQL, the data was not compressed (the size of the single table in TiDB is 597M, while the single table in MySQL is 459M). Is this normal? Can someone clarify?

Problem Reproduction:

  1. Import a table with 2 million records into TiDB V7.5 and MySQL V5.12 using sysbench
    sysbench --db-driver=mysql
    oltp_read_write prepare

  2. Calculate the size respectively
    The size of the single table in TiDB is 597M, and the size of the single table in MySQL is 459M.
    Screenshot of TiDB table size

Screenshot of MySQL table size

| username: 春风十里 | Original post link

I have another question. In the information_schema.tables view, it states:

  • DATA_LENGTH: Data length. Data length = number of rows in the statistics × sum of the storage lengths of each column in the tuple. This does not yet consider the number of replicas in TiKV.

If there are three replicas, is the actual size occupied by the table equal to DATA_LENGTH multiplied by 3?

I found a script online, and it calculated the disk space for a 597M table to be 1243M. The result doesn’t seem to be DATA_LENGTH × 3. So, how can I accurately calculate the actual disk space used by the table?

| username: TiDBer_小阿飞 | Original post link

TiDB’s TiKV nodes use RocksDB, and the default compression algorithm for RocksDB is: [no:no:lz4:lz4:lz4:zstd:zstd]
The compression for RocksDB’s 6 levels is: [no:no:lz4:lz4:lz4:zstd:zstd]

This means that level0 and level1 are not compressed;

level2 to level4 use the lz4 compression algorithm;

level5 and level6 use the zstd compression algorithm.

‘no’ means no compression, ‘lz4’ is a balanced compression algorithm in terms of speed and compression ratio, ‘zlib’ has a high compression ratio and is storage-friendly, but the compression speed is relatively slow and requires more CPU resources during compression.

Original link: TiDB的tikv节点的压缩算法_tidb 压缩算法-CSDN博客

| username: zhanggame1 | Original post link

The store_size_amplification represents the average compression ratio of the cluster. Besides using the SELECT * FROM METRICS_SCHEMA.store_size_amplification; statement to query, you can also check the Size amplification metric for each node under the PD - statistics balance panel in Grafana monitoring to obtain this information. The average compression ratio of the cluster is the average of the Size amplification values of all nodes.

Approximate_Size represents the size of a single replica of the table before compression. This value is an estimate and not an exact value.

Disk_Size represents the size of the table after compression. It can be estimated based on Approximate_Size and store_size_amplification.

| username: 春风十里 | Original post link

Because my test environment has a small amount of data, it hasn’t reached the compression level yet?

| username: andone | Original post link

It feels like it hasn’t reached a certain threshold.

| username: 小龙虾爱大龙虾 | Original post link

The data_length in information_schema.tables is obtained based on statistics, right? To truly test the compression rate, it is recommended to use a larger dataset and finally compare the file system usage. This way, it will be more accurate.

| username: 春风十里 | Original post link

After reading the official documentation, it seems that the issue is related to the small data volume. When the data size of level1 reaches the value specified by max-bytes-for-level-base, it triggers compaction between the sst of level1 and the overlapping sst of level2.

Golden Rule: The primary reference for setting max-bytes-for-level-base is to ensure it is roughly equal to the data volume of level0. This helps reduce unnecessary compactions. For example, if the compression method is “no:no:lz4:lz4:lz4:lz4:lz4”, then the value of max-bytes-for-level-base should be the size of write-buffer-size multiplied by 4, because neither level0 nor level1 is compressed, and the condition for level0 to trigger compaction is when the number of sst files reaches 4 (default value). If both level0 and level1 are compressed, you need to analyze the RocksDB logs to see the approximate size of an sst file compressed from a memtable. For example, if it is 32MB, then the recommended value for max-bytes-for-level-base should be 32MB * 4 = 128MB.

max-bytes-for-level-base = “512MB”

I’ll try importing the test environment data another day.

| username: 江湖故人 | Original post link

The oltp_read_write includes operations like delete, so testing with insert is relatively fair, and it should involve as many rows as possible. :thinking:

| username: 春风十里 | Original post link

I just prepared the data, haven’t tested the performance yet.

| username: tidb菜鸟一只 | Original post link

After compression, it’s about 1/3, and then the three replicas are roughly the same size… The size comparison of the data prepared by sysbench for MySQL and TiDB is like this.

| username: dba远航 | Original post link

Check the default compression algorithm and compression level.