TiKV Region Size Suddenly Increases and Remains High

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Tikv region size突然上涨,居高不下

| username: TiDBer_27OdodiJ

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.4.0
[Encountered Problem] Continuous alert: "TiKV approximate region size is more than 1GB"
[Reproduction Path] None yet
[Problem Phenomenon and Impact]
Phenomenon: The alert fires continuously, and the values on the [approximate region size] panel in Grafana keep rising
Impact: None for now

Cluster Purpose: [JuiceFS Metadata Service]
This alert has been firing for more than a day, and the region size is growing very quickly. The cluster consists mainly of TiKV nodes, plus one TiDB instance deployed for GC, which is currently working normally.
Apart from this alert, there are no other anomalies in the cluster.

[Attachments]
Tiup cluster Display information:

Tiup Cluster Edit Config information:

[approximate region size] panel monitoring:

TiKV part of the logs:

Information of one of the regions:
The default CF is very large, and the MVCC num_rows value is particularly high

GC configuration is as follows:

TiKV GC panel


| username: TiDBer_jYQINSnf

The log says the split failed because the specified split key does not belong to the current region. Did you perform any manual split operation?

| username: TiDBer_27OdodiJ

Nothing has been done. This cluster has only ever been used for JuiceFS, and no manual operations have been performed on it.

| username: TiDBer_27OdodiJ

The increase today is particularly significant.

| username: wuxiangdong

Could a scheduled job be batch-writing a large volume of data?

| username: wuxiangdong

You can increase batch-split-limit (a TiKV coprocessor configuration item) to speed up region splitting.

| username: wisdom

Is there a batch task to process data?

| username: zhouzeru

Split failed, the specified split key does not belong to the current region.

| username: TiDBer_27OdodiJ

The cluster's region size has now returned to normal levels on its own. I am not sure whether this sudden spike was caused by a large volume of writes. Is there a monitoring metric on the TiKV side that can confirm whether heavy writes were the cause?

| username: Raymond

You can check the heatmap (Key Visualizer) in TiDB Dashboard to see whether there are any write hotspots.

| username: WalterWj

Use pd-ctl to manually split the affected region:

operator add split-region 1 --policy=approximate // Split Region 1 into two regions approximately
operator add split-region 1 --policy=scan // Split Region 1 into two regions based on an accurate scan
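
If many regions are oversized, typing these commands one by one gets tedious. Below is a minimal sketch, not an official tool, that generates the split commands from the JSON printed by pd-ctl's region command. It assumes the output has a top-level "regions" list whose entries carry "id" and "approximate_size" (in MiB); that matches what I recall from pd-ctl output, but please verify it on your version. The function name split_commands, the 1024 MiB threshold, and the sample data are illustrative only.

```python
import json


def split_commands(region_json, threshold_mib=1024, policy="approximate"):
    """Build pd-ctl split commands for regions larger than threshold_mib.

    Assumption: region_json looks like pd-ctl `region` output, i.e.
    {"regions": [{"id": ..., "approximate_size": <MiB>, ...}, ...]}.
    """
    data = json.loads(region_json)
    oversized = [
        r for r in data.get("regions", [])
        if r.get("approximate_size", 0) > threshold_mib
    ]
    # Handle the largest regions first.
    oversized.sort(key=lambda r: r["approximate_size"], reverse=True)
    return [
        f"operator add split-region {r['id']} --policy={policy}"
        for r in oversized
    ]


if __name__ == "__main__":
    # Hypothetical sample standing in for real pd-ctl output.
    sample = json.dumps({
        "count": 3,
        "regions": [
            {"id": 101, "approximate_size": 96},
            {"id": 102, "approximate_size": 2048},
            {"id": 103, "approximate_size": 4096},
        ],
    })
    for cmd in split_commands(sample):
        print(cmd)
```

In practice you would feed the function the saved output of the region command instead of the sample, review the generated list, and then run the commands in pd-ctl. The approximate policy is the cheaper default here; switch to scan if you need an exact midpoint.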