KVClient Batch Write to TiKV Causes Region Abnormality After Region Split

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: KVClient批写入tikv至region拆分后region异常

| username: TiDBer_Jxly6caL

[TiDB Usage Environment] Test
[TiDB Version]
[Reproduction Path]
KVClient batch-writes to TiKV; after roughly 960,000 entries the region splits. From that point on, any query or write to TiKV reports Region invalid: null.
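
For reference, a minimal sketch of the kind of write loop involved (illustrative only: the PD endpoint, key format, and values are placeholders; the "inodes" key prefix is merely inferred from the split key 696E6F646573… in the TiKV log below):

import java.nio.charset.StandardCharsets;
import org.tikv.common.TiConfiguration;
import org.tikv.common.TiSession;
import org.tikv.raw.RawKVClient;
import org.tikv.shade.com.google.protobuf.ByteString; // older client versions may use the unshaded com.google.protobuf.ByteString

public class BatchWriteRepro {
  public static void main(String[] args) throws Exception {
    TiConfiguration conf = TiConfiguration.createDefault("127.0.0.1:2379"); // placeholder PD endpoint
    try (TiSession session = TiSession.create(conf);
         RawKVClient client = session.createRawClient()) {
      for (long i = 0; i < 1_000_000L; i++) {
        // "inodes" prefix inferred from the split key in the logs; the exact key/value layout is a guess
        ByteString key = ByteString.copyFromUtf8(String.format("inodes%016d", i));
        ByteString value = ByteString.copyFromUtf8("value-" + i);
        client.put(key, value); // starts failing with "Region invalid: null" once the region splits
      }
    }
  }
}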

[Encountered Issue: Phenomenon and Impact]
TiKV queries or inserts encounter the following error:
Exception in thread "main" org.tikv.common.exception.GrpcException: retry is exhausted.
at org.tikv.common.util.ConcreteBackOffer.logThrowError(ConcreteBackOffer.java:266)
at org.tikv.common.util.ConcreteBackOffer.doBackOffWithMaxSleep(ConcreteBackOffer.java:251)
at org.tikv.common.util.ConcreteBackOffer.doBackOff(ConcreteBackOffer.java:190)
at org.tikv.raw.RawKVClient.put(RawKVClient.java:164)
at org.tikv.raw.RawKVClient.put(RawKVClient.java:142)
at com.ebupt.test.tikv.TiKV.main(TiKV.java:71)
Caused by: org.tikv.common.exception.TiClientInternalException: Region invalid: null
at org.tikv.common.region.RegionManager.getRegionStorePairByKey(RegionManager.java:188)
at org.tikv.common.region.RegionStoreClient$RegionStoreClientBuilder.build(RegionStoreClient.java:1438)
at org.tikv.common.region.RegionStoreClient$RegionStoreClientBuilder.build(RegionStoreClient.java:1428)
at org.tikv.raw.RawKVClient.put(RawKVClient.java:158)

[Resource Configuration]
Three machines with 16 GB RAM / 8 cores each, running TiKV and PD

[Attachments: Screenshots/Logs/Monitoring]
After batch writing exceeds the region split boundary, the TiKV logs are as follows:
[2023/04/07 14:03:30.645 +08:00] [INFO] [apply.rs:1612] ["execute admin command"] [command="cmd_type: BatchSplit splits { requests { split_key: 696E6F646573000000000009BBDD new_region_id: 8 new_peer_ids: 9 new_peer_ids: 10 new_peer_ids: 11 } right_derive: true }"] [index=1981] [term=6] [peer_id=6] [region_id=2]
[2023/04/07 14:03:30.645 +08:00] [INFO] [apply.rs:2470] ["split region"] [keys="key 696E6F646573000000000009BBDD"] [region="id: 2 region_epoch { conf_ver: 5 version: 1 } peers { id: 3 store_id: 1 } peers { id: 6 store_id: 4 } peers { id: 7 store_id: 5 }"] [peer_id=6] [region_id=2]
[2023/04/07 14:03:30.647 +08:00] [INFO] [peer.rs:3890] ["moving 0 locks to new regions"] [region_id=2]
[2023/04/07 14:03:30.647 +08:00] [INFO] [peer.rs:3985] ["insert new region"] [region="id: 8 end_key: 696E6F646573000000000009BBDD region_epoch { conf_ver: 5 version: 2 } peers { id: 9 store_id: 1 } peers { id: 10 store_id: 4 } peers { id: 11 store_id: 5 }"] [region_id=8]
[2023/04/07 14:03:30.647 +08:00] [INFO] [peer.rs:262] ["create peer"] [peer_id=10] [region_id=8]
[2023/04/07 14:03:30.647 +08:00] [INFO] [raft.rs:2646] ["switched to configuration"] [config="Configuration { voters: Configuration { incoming: Configuration { voters: {9, 10, 11} }, outgoing: Configuration { voters: {} } }, learners: {}, learners_next: {}, auto_leave: false }"] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.647 +08:00] [INFO] [raft.rs:1120] ["became follower at term 5"] [term=5] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.647 +08:00] [INFO] [raft.rs:384] [newRaft] [peers="Configuration { incoming: Configuration { voters: {9, 10, 11} }, outgoing: Configuration { voters: {} } }"] ["last term"=5] ["last index"=5] [applied=5] [commit=5] [term=5] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.647 +08:00] [INFO] [raw_node.rs:315] ["RawNode created with id 10."] [id=10] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.647 +08:00] [INFO] [raft.rs:1565] ["[logterm: 5, index: 5, vote: 0] cast vote for 9 [logterm: 5, index: 5] at term 5"] ["msg type"=MsgRequestPreVote] [term=5] [msg_index=5] [msg_term=5] [from=9] [vote=0] [log_index=5] [log_term=5] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.656 +08:00] [INFO] [raft.rs:1364] ["received a message with higher term from 9"] ["msg type"=MsgRequestVote] [message_term=6] [term=5] [from=9] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.656 +08:00] [INFO] [raft.rs:1120] ["became follower at term 6"] [term=6] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.656 +08:00] [INFO] [raft.rs:1565] ["[logterm: 5, index: 5, vote: 0] cast vote for 9 [logterm: 5, index: 5] at term 6"] ["msg type"=MsgRequestVote] [term=6] [msg_index=5] [msg_term=5] [from=9] [vote=0] [log_index=5] [log_term=5] [raft_id=10] [region_id=8]
[2023/04/07 14:03:30.658 +08:00] [INFO] [subscription_manager.rs:395] ["backup stream: on_modify_observe"] [op="Stop { region: id: 8 end_key: 696E6F646573000000000009BBDD region_epoch { conf_ver: 5 version: 2 } peers { id: 9 store_id: 1 } peers { id: 10 store_id: 4 } peers { id: 11 store_id: 5 } }"]
[2023/04/07 14:03:30.658 +08:00] [WARN] [subscription_track.rs:143] ["trying to deregister region not registered"] [region_id=8]

| username: xfworld | Original post link

This error does not necessarily mean the cluster is affected; the region may simply have been split into new ones.

Once the region's metadata changes, the old region naturally can no longer be found.

I'd recommend continuing to test.

| username: TiDBer_Jxly6caL | Original post link

The region has already been split, and I can see that the information on PD has been updated (/pd/api/v1/regions), but any read or write against the TiKV cluster reports the error above: Region invalid: null.

Stepping through the debug output for an insert: at the step that obtains the region, the client can locate the corresponding region from its start and end keys and also gets the leader's store_id, but the region object that is ultimately created is null, which leads to the error above.

| username: xfworld | Original post link

The creation and scheduling of regions are done by PD, not the client…

| username: TiDBer_Jxly6caL | Original post link

Sorry, my previous wording was misleading; I meant the creation of the RegionStoreClient object, not of a region itself.

| username: xfworld | Original post link

There is a misconception here:

  1. An error does not necessarily mean there is an issue with the TiDB cluster. It is best to check the service status through the dashboard, Prometheus, and logs.
  2. Does this error continue to recur?
| username: TiDBer_Jxly6caL | Original post link

  1. Currently, the cluster does not have TiDB installed, and I see that the dashboard requires a SQL account, so I can’t log in. Which metrics on Prometheus can help identify this issue?
  2. Reproduction: After clearing the cluster data, the issue occurs when I reinsert 1 million rows of data. If I only insert 900,000 rows and do not trigger region splitting, it does not affect usage. Initially, I thought that after the region split, it would take some time for the data to be redistributed, but even after leaving it for a day, the query remains in this state. It seems that after the split, the information in the region cannot be queried. When querying a key that was successfully inserted before, the final result returned by this piece of source code is still null:
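    // getRegionByKey asks PD for the Metapb.Region that covers `key`, together with its leader peer;
    // createRegion wraps that into the client-side region object which putRegion caches, and it is this object that comes back null here.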
    Pair<Metapb.Region, Metapb.Peer> regionAndLeader = pdClient.getRegionByKey(backOffer, key);
    region = cache.putRegion(createRegion(regionAndLeader.first, regionAndLeader.second, backOffer));
| username: wuxiangdong | Original post link

The region split was not completed, causing the insertion to fail.

| username: TiDBer_Jxly6caL | Original post link

Is there any way to check whether the region has finished splitting? The entries above were the last relevant log output at the time, and there have been no related log entries since.

| username: xfworld | Original post link

Prometheus and Grafana can be used to check how the region count on PD changes.

If 1 million records have already been inserted, and you continue to insert without clearing them, is it still readable and writable?
If a backoff occurs, it indicates that the region split has not yet completed…
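
As a concrete check (just a sketch; the PD address is a placeholder), the regions API already mentioned above lists every region together with its start_key, end_key, epoch, and peers:

curl http://127.0.0.1:2379/pd/api/v1/regions

If the split has finished, both regions (region 2 and the new region 8 from the logs above) should appear there with updated epochs.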

| username: TiDBer_Jxly6caL | Original post link

After the number of regions and leaders shown in Grafana became 2, the RocksDB CPU on all three nodes also settled back down within about a minute, so the reads and writes had completed. Once the region count became 2, no operations could be performed at all, including reads, inserts, and deletes. Does "backoff" refer to something in the logs? I don't see any similar messages there.

| username: TiDBer_Jxly6caL | Original post link

The batch write uses RawKVClient, which is non-transactional. Will there be any impact if the region splits during the insertion process?

| username: TiDBer_CkEj07nA | Original post link

I encountered this problem here as well. Have you resolved it on your end?

| username: TiDBer_Jxly6caL | Original post link

No, I later tried pre-splitting the region and encountered new problems.

| username: TiDBer_Jxly6caL | Original post link

It turned out to be an issue with how the TiKV client was performing queries. When using RawKVClient for queries, you need to add the following setting:
conf.setKvMode(String.valueOf(TiConfiguration.KVMode.RAW));
Refer to rfcs/text/0069-api-v2.md at master · tikv/rfcs · GitHub.
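
Putting it together, a rough sketch of a working setup (the PD endpoint and key are placeholders; only the setKvMode line is the setting quoted above):

import org.tikv.common.TiConfiguration;
import org.tikv.common.TiSession;
import org.tikv.raw.RawKVClient;
import org.tikv.shade.com.google.protobuf.ByteString; // older client versions may use the unshaded com.google.protobuf.ByteString

public class RawKvQuery {
  public static void main(String[] args) throws Exception {
    TiConfiguration conf = TiConfiguration.createDefault("127.0.0.1:2379"); // placeholder PD endpoint
    conf.setKvMode(String.valueOf(TiConfiguration.KVMode.RAW)); // the setting from this post
    try (TiSession session = TiSession.create(conf);
         RawKVClient client = session.createRawClient()) {
      client.put(ByteString.copyFromUtf8("some-key"), ByteString.copyFromUtf8("some-value"));
      client.get(ByteString.copyFromUtf8("some-key")); // no longer fails with "Region invalid: null"; return type varies by client version
    }
  }
}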