Questions about TIKV_REGION_STATUS, TIKV_REGION_PEERS, and the current data volume in the tidb-acloud-overview monitoring

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于TIKV_REGION_STATUS、TIKV_REGION_PEERS及监控中tidb-acloud-overview的当前数据量疑问

| username: wenyi

Question 1:
What is the relationship between the number of regions in tikv_region_peers and tikv_region_status?
The documentation says that tikv_region_status contains the basic information of TiKV regions. I understand this to mean the total number of regions in the cluster?
What does the number of regions in the tikv_region_peers table represent?

Question 2:
The current data volume in the monitoring is 1T.
When I calculate the region size in tikv_region_status and compare it with the current data volume in the monitoring, the difference is too large.
How can I determine the total number of regions in the cluster? How can I query this?

| username: Billmay表妹 | Original post link

The roles and meanings of these two tables:

  • tikv_region_status table: This table provides basic information about each Region in the TiKV cluster, including Region ID, start key, end key, table ID, database name, table name, etc. [1]. By querying this table, you can obtain detailed information about all Regions in the TiKV cluster.

  • tikv_region_peers table: This table provides information about each Region’s Peers (replicas) in the TiKV cluster, including Peer ID, Store ID, whether it is a Learner or Leader, etc. [2]. By querying this table, you can understand the distribution of Peers for each Region.

The number of Regions in these two tables represents different meanings:

  • The number of Regions in the tikv_region_status table represents the total number of Regions in the entire TiKV cluster. Each Region corresponds to a portion of the data in the TiKV cluster.

  • The number of Regions in the tikv_region_peers table represents the number of Regions on each TiKV node. Each TiKV node is responsible for storing and processing a portion of the Region’s data and maintaining the Peer information of these Regions.

Therefore, the number of Regions in the tikv_region_status table reflects the overall situation of the entire TiKV cluster, while the number of Regions in the tikv_region_peers table reflects the local situation on each TiKV node.

[2]: TiDB Official Documentation - INFORMATION_SCHEMA.TIKV_REGION_PEERS: TIKV_REGION_PEERS | PingCAP 文档中心
[1]: TiDB Official Documentation - INFORMATION_SCHEMA.TIKV_REGION_STATUS: TIKV_REGION_STATUS | PingCAP 文档中心

| username: wenyi | Original post link

  • The number of Regions in the tikv_region_status table represents the total number of Regions in the entire TiKV cluster. Each Region corresponds to a portion of the data in the TiKV cluster.
  • The number of Regions in the tikv_region_peers table represents the number of Regions on each TiKV node. Each TiKV node is responsible for storing and processing a portion of the Region’s data and maintaining the Peer information of these Regions.

Why aren’t they in a 3x relationship?

| username: wenyi | Original post link

To clarify, this database is static data with no data modifications. It’s a test cluster that I am playing with.

| username: 小龙虾爱大龙虾 | Original post link

Question 1:

  1. The tikv_region_peers table records peer information. Normally, if a region is set to have 3 replicas and all regions are healthy without TiFlash, the number of rows in this table should be three times the number of regions.
  2. The tikv_region_status table records region information, but since it does not record one row per region, its row count does not equal the number of regions. Only count(distinct region_id) equals the number of regions.

Question 2:

  1. The current storage size monitoring comes from PD monitoring, which is reported by TiKV nodes. It should represent the physical data size of the node.
  2. Your method of calculating size is incorrect. Firstly, the number of regions won’t be as mentioned above. Secondly, not all regions are 96MB.
  3. The number of regions in the cluster can be directly seen through monitoring, in the PD panel => number of regions.
| username: wenyi | Original post link

The calculated result also differs greatly, not every region is 96M, I understand, but the result differs too much.

| username: zhanggame1 | Original post link

The size should be checked using the APPROXIMATE_SIZE field, not directly calculated as 96M.

TiKV data is compressed. APPROXIMATE_SIZE is the compressed size. 3 * APPROXIMATE_SIZE / compression ratio is the total disk usage of TiKV. Refer to this for the compression ratio:

| username: 小龙虾爱大龙虾 | Original post link

There’s a big difference, your calculation formula is incorrect.

| username: zxgaa | Original post link

The first one is the number of replicas, and the second one is the number of regions. These are different concepts.

| username: dba远航 | Original post link

From the picture, it shows that you have three replicas. If you know how it is calculated, you will understand their differences.

| username: wluckdog | Original post link

This table lookup size is also inaccurate.

SELECT
    db_name,
    table_name,
    ROUND(SUM(total_size / cnt), 2)/1024 Approximate_Size,
    ROUND(SUM(total_size / cnt / (SELECT
                    ROUND(AVG(value), 2)
                FROM
                    METRICS_SCHEMA.store_size_amplification
                WHERE
                    value > 0)),
            2)/1024 Disk_Size
FROM
    (SELECT
        db_name,
            table_name,
            region_id,
            SUM(Approximate_Size) total_size,
            COUNT(*) cnt
    FROM
        information_schema.TIKV_REGION_STATUS
    WHERE
        db_name = 'db_name'
        --    AND table_name IN ('t_name')
    GROUP BY db_name , table_name , region_id) tabinfo
GROUP BY db_name , table_name;

After this query, the total disk space size doesn’t match.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.