With the same zstd compression format and the same amount of data, why is a TiDB table with TiFlash configured nearly twice the size of the same data in other zstd-compressed databases?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 相同的zstd 的压缩格式下 + 相同的数据量,在设置了TIFLASH的情况下,为什么TIDB的压缩大小是其他zstd格式的库接近两倍的大小?

| username: 卡卡其其

[TiDB Usage Environment] Production Environment
[TiDB Version] 6.5.0
[Reproduction Path]

  1. Imported data into TiDB with the zstd compression format: 600 million records, close to 600 GB. The table has TiFlash configured.
    Removing TiFlash leaves the reported size unchanged.

  2. What is the command to check the total size of a table in TiKV and in TiFlash? To estimate the table's size in TiKV, can I roughly split the total size in half to get the TiKV and TiFlash sizes separately?
[Encountered Problem: Problem Phenomenon and Impact]
[Resource Configuration]
[Attachment: Screenshot/Log/Monitoring]

[Used Command]
select 'tidb_ct', count(1) as data
from ods_global.ods_g3_dbo_subfhd
union all
select 'ods_g3_dbo_subfhd' as tidb_ct, concat(round(sum(data_length / 1024 / 1024 / 1024), 2), 'GB') as data
from information_schema.tables
where table_schema = 'ods_global'
and TABLE_NAME = 'ods_g3_dbo_subfhd';

| username: tidb菜鸟一只

TiKV stores multiple replicas of the data. How many replicas did you configure?
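
A minimal sketch of how the replica counts and per-table storage statistics could be checked from a SQL client. The schema and table names are reused from the query in the question. Treat the figures as estimates: `TABLE_SIZE` is documented as being in MiB, and whether a given figure already includes every replica is an assumption to verify against your own cluster and the TiDB documentation for your version.

```sql
-- Sketch only: schema and table names are taken from the question above.

-- Cluster-wide TiKV replica count as configured in PD.
SHOW CONFIG WHERE type = 'pd' AND name = 'replication.max-replicas';

-- TiFlash replica count and sync status for this table.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'ods_global'
  AND TABLE_NAME = 'ods_g3_dbo_subfhd';

-- Region-based storage statistics for this table (TABLE_SIZE in MiB).
SELECT TABLE_SCHEMA, TABLE_NAME, PEER_COUNT, REGION_COUNT, TABLE_SIZE
FROM information_schema.table_storage_stats
WHERE TABLE_SCHEMA = 'ods_global'
  AND TABLE_NAME = 'ods_g3_dbo_subfhd';
```

Comparing these results with the `data_length` figure from `information_schema.tables` may help show whether the number being compared is a statistics-based estimate or an on-disk size.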