PD Monitoring Dashboard -> Size amplification

translator_bot · June 23, 2024, 1:41am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: pd监控面板–>Size amplification

| username: Raymond

The PD monitoring dashboard has a panel called “Size amplification.” Based on the official documentation and how RocksDB works, I would expect this panel to show the ratio of the storage space actually used to the true size of the data (e.g., if the true size of the data is 100 MB but it occupies 200 MB on disk, the ratio is 2).
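To make the definition in the paragraph above concrete, here is a minimal sketch of that ratio in Python. The function name `space_amplification` and its parameters are made up for illustration; they are not TiKV or PD identifiers.

```python
MB = 2 ** 20  # bytes per MB (binary megabyte), used only for readability


def space_amplification(disk_bytes: float, logical_bytes: float) -> float:
    """Return disk space used divided by the true (logical) data size.

    Hypothetical helper: both arguments are in bytes.
    """
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return disk_bytes / logical_bytes


# The example from the post: 100 MB of data occupying 200 MB on disk.
ratio = space_amplification(disk_bytes=200 * MB, logical_bytes=100 * MB)
print(ratio)  # 2.0
```

Under this definition a value above 1 means the data takes more space on disk than its logical size.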

Because TiKV keeps multiple MVCC versions of a piece of data in RocksDB, and expired versions are not cleaned up immediately, these extra versions also occupy space.
During RocksDB compaction, for example when two files from an upper level are compacted into the next level, the two original files can only be deleted after the compaction completes, which also causes space amplification. In that case the space amplification can be considered to be 2.

So, can we conclude from this metric that a very high space amplification ratio may indicate that old versions are not being cleaned up in time, for example because the GC time is set too long or GC is not effective? Is this conclusion correct?

However, my question is: why is the PromQL expression written this way? Why does it need to be multiplied by 2^20?

| username: Raymond

But in one place the official documentation says it is the compression ratio.

In another place the official documentation says it is the space amplification ratio. What exactly is Size amplification?

Now I’m a bit confused.

| username: Raymond

It turns out this is the cluster compression ratio, because the value is calculated as region_size / store_used. It needs to be multiplied by 2^20 because the unit of region_size is MB, while the unit of store_used is bytes.
It can also be regarded as TiKV’s space amplification. However, region_size is measured before compression, while store_used is measured after compression. This way of calculating a compression ratio is not very rigorous.
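
The unit conversion described above can be sketched in Python as follows. This is only an illustration of the arithmetic, assuming region_size is reported in MB (2^20 bytes) and store_used in bytes as stated in this post; the function and parameter names are hypothetical and are not the actual metric names used by the panel.

```python
BYTES_PER_MB = 2 ** 20  # 1 MB = 1,048,576 bytes, the 2^20 factor in the formula


def size_amplification(region_size_mb: float, store_used_bytes: float) -> float:
    """Compute region_size / store_used after converting both to bytes.

    region_size_mb:   approximate Region size in MB (before compression)
    store_used_bytes: space used on the store in bytes (after compression)
    """
    if store_used_bytes <= 0:
        raise ValueError("store_used_bytes must be positive")
    return region_size_mb * BYTES_PER_MB / store_used_bytes


# Example: 300 MB of Region data stored in 100 MB of disk space.
print(size_amplification(300, 100 * BYTES_PER_MB))  # 3.0

# Without the 2^20 factor the two units do not match and the result is tiny.
print(300 / (100 * BYTES_PER_MB))
```

Without the multiplication, the numerator would be in MB and the denominator in bytes, so the result would be about a million times too small.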
