TiKV Region Size Suddenly Increases and Remains High

translator_bot · June 23, 2024, 2:12am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Tikv region size突然上涨，居高不下

| username: TiDBer_27OdodiJ

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.4.0
[Encountered Problem] The alert "TiKV approximate region size is more than 1GB" fires continuously
[Reproduction Path] Not yet
[Problem Phenomenon and Impact]
Phenomenon: The alert fires continuously, and the value on the [approximate region size] panel in Grafana keeps rising
Impact: None for now

Cluster Purpose: [JuiceFS Metadata Service]
This alert has been firing for more than a day, and the Region size is growing very fast. The cluster consists mainly of TiKV nodes; a single TiDB instance is deployed only to drive GC, and it is currently working normally.
Apart from this alert, there are no other anomalies in the cluster.

[Attachments]
TiUP cluster display output:

TiUP cluster edit-config output:

[approximate region size] panel monitoring:

Excerpt from the TiKV logs:

Information of one of the regions:
CF default is very large, and mvcc num_rows is particularly high

GC configuration is as follows:

TiKV GC panel


| username: TiDBer_jYQINSnf

The log states that the split failed because the specified split key does not belong to the current region. So, what did you do regarding the split?

| username: TiDBer_27OdodiJ

Nothing has been done. This cluster has only ever been used for JuiceFS, and there has been no manual intervention.

| username: TiDBer_27OdodiJ

The increase today is particularly significant.

| username: wuxiangdong

Could a scheduled task be batch-writing a large volume of data?

| username: wuxiangdong

You can increase batch-split-limit (in the TiKV coprocessor configuration) to speed up Region splitting.
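
As a rough sketch of how that change might be applied on a TiUP-managed cluster: the helper below only prints the topology fragment to merge in through edit-config. The function name, the cluster name "my-cluster", and the value 20 are illustrative assumptions, not recommendations from this thread; TiKV's documented default for this setting is 10.

```shell
# Print the server_configs fragment that raises TiKV's batch split limit.
# The default of 20 used here is only an illustrative value.
batch_split_fragment() {
  local limit="${1:-20}"
  case "$limit" in
    ''|*[!0-9]*) echo "limit must be numeric" >&2; return 1 ;;
  esac
  printf 'server_configs:\n  tikv:\n    coprocessor.batch-split-limit: %s\n' "$limit"
}

# Typical workflow (cluster name "my-cluster" is hypothetical):
#   batch_split_fragment 20                 # shows the lines to merge in
#   tiup cluster edit-config my-cluster     # add them under server_configs
#   tiup cluster reload my-cluster -R tikv  # rolling reload of the TiKV nodes
```

Raising the limit only lets TiKV split more Regions per batch; it would not help if splits are failing for another reason, such as the split-key error in the logs above.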

| username: wisdom

Is there a batch task to process data?

| username: zhouzeru

The split failed because the specified split key does not belong to the current Region.

| username: TiDBer_27OdodiJ

The cluster's Region size has now returned to normal levels on its own. It is not certain that this sudden spike was caused by a large volume of writes. Is there any monitoring metric on the TiKV side that can confirm whether heavy writes were the cause?

| username: Raymond

You can check the hotspot map on the dashboard to see if there are any write hotspots.
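
For a command-line view of the same question, here is a minimal sketch using pd-ctl. It assumes pd-ctl is run through tiup ctl, that PD listens on http://127.0.0.1:2379, and that the ctl version matches the 5.4.0 cluster; the function names are hypothetical. These commands report current statistics, so they are useful while a spike is still in progress rather than after it has subsided.

```shell
# PD address is an assumption; override it for your cluster.
PD_ADDR="${PD_ADDR:-http://127.0.0.1:2379}"

# Thin wrapper so the pd-ctl invocation lives in one place.
pd_ctl() {
  tiup ctl:v5.4.0 pd -u "$PD_ADDR" "$@"
}

# Show what PD currently knows about write load and Region size.
check_write_hotspots() {
  local limit="${1:-5}"
  pd_ctl hot write                  # Regions PD currently treats as write-hot
  pd_ctl region topwrite "$limit"   # Regions with the highest write flow
  pd_ctl region topsize "$limit"    # Regions with the largest approximate size
}

# Usage: check_write_hotspots 10
```

If the Regions listed by topwrite and topsize overlap, that would be consistent with a write burst driving the growth, though it does not prove it.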

| username: WalterWj

Use pd-ctl to manually split the affected Region.

operator add split-region 1 --policy=approximate // Split Region 1 into two regions approximately
operator add split-region 1 --policy=scan // Split Region 1 into two regions based on an accurate scan
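
The operator commands above are typed inside pd-ctl. As a hedged sketch of issuing them from a shell instead, the helper below builds the command string and validates its arguments first. The PD address, the tiup ctl version, and both function names are assumptions; only the two policies mentioned in this post are accepted.

```shell
# PD address is an assumption; override it for your cluster.
PD_ADDR="${PD_ADDR:-http://127.0.0.1:2379}"

# Build the pd-ctl operator command for splitting one Region.
split_region_cmd() {
  local region_id="$1" policy="${2:-approximate}"
  case "$region_id" in
    ''|*[!0-9]*) echo "region id must be numeric" >&2; return 1 ;;
  esac
  case "$policy" in
    approximate|scan) ;;
    *) echo "unsupported policy: $policy" >&2; return 1 ;;
  esac
  printf 'operator add split-region %s --policy=%s\n' "$region_id" "$policy"
}

# Send the command to PD through tiup ctl.
run_split() {
  local cmd
  cmd="$(split_region_cmd "$@")" || return 1
  # Word splitting of $cmd is intentional: it expands into pd-ctl arguments.
  tiup ctl:v5.4.0 pd -u "$PD_ADDR" $cmd
}

# Usage: run_split 1 scan
```

Replace the Region ID 1 with the ID of the oversized Region taken from the alert or from the Region information shown earlier.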