Error "Region Unavailable" When Inserting Large Amounts of Data into a Table

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 往一张表大量插入数据的时候报region不可用

| username: cy6301567

When inserting a large amount of data into a table, an error indicating that the region is unavailable is reported. What is the usual cause of this situation?
Cause: java.sql.SQLException: Region is unavailable; uncategorized SQLException; SQL state [HY000]; error code [9005]; Region is unavailable; nested exception is java.sql.SQLException: Region is unavailable

| username: tidb菜鸟一只 | Original post link

Take a look at this.

| username: TiDB_C罗 | Original post link

Combine with monitoring to see what the issue is.

| username: zhanggame1 | Original post link

Check the TiDB error logs, there should be records. With just this information, it’s not possible to determine the issue.

| username: YuchongXU | Original post link

Please also send the related TiKV logs.

| username: redgame | Original post link

If the TiDB cluster is overloaded, it may be unable to handle new requests, resulting in a “Region is unavailable” error.

| username: ljluestc | Original post link

TiKV Node Failure: TiDB database uses TiKV as the storage engine. If a TiKV node fails or becomes inaccessible, the corresponding Region may become unavailable. You can check the status of the TiKV nodes to ensure all nodes are up and running, and review the TiKV logs for more detailed information.

Data Skew: When inserting data, if a certain Region is overloaded, it may cause that Region to become unavailable. This could be due to data skew, where a portion of the data is concentrated in a specific Region, leading to overload. Data skew issues can be resolved by proper sharding and adjusting the data distribution.

Raft Leader Switch: TiKV uses the Raft algorithm to achieve data replication and high availability. If a Raft Leader switch occurs during data insertion, it may cause the current Leader to become unavailable, resulting in a “Region is unavailable” error. This is usually a normal cluster maintenance operation but may affect the current insertion operation. Typically, waiting for a while will allow the TiKV cluster to automatically return to normal.

| username: cy6301567 | Original post link

Very comprehensive.

| username: cy6301567 | Original post link

Yes, it was caused by a TiKV node failure, due to accidentally joining two large tables with a non-unique key.

| username: h5n1 | Original post link

Did TiKV run out of memory? How much memory do you have, and what is your block-cache setting?

| username: cy6301567 | Original post link

Two large tables without unique keys caused an OOM restart.

| username: cy6301567 | Original post link

Yes, it was restarted due to OOM.

| username: jansu-dev | Original post link

Could you send a copy of the tikv-details metrics? PingCAP MetricsTool

| username: ShawnYan | Original post link

Advanced, got a new tool link again, thanks.