How does TiDB avoid timeout when analyzing large regions?

translator_bot · June 23, 2024, 8:38am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB如何避免大region analyze超时的？

| username: TiDBer_D7483dYr

【TiDB Usage Environment】Production\Test Environment\POC Test
【TiDB Version】Latest
【Encountered Problem】

Looking at the latest code, the RPC timeout for a single Region's analyze request is still tikv.ReadTimeoutMedium (60s).
When a Region is very large, for example 10 GB, 60s may not be enough for the request to finish.
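
To put the concern in numbers, here is a rough back-of-the-envelope sketch. It only uses the two figures from the post (a 60s timeout and a 10 GB Region, treated here as 10 GiB); the function name is made up for illustration and nothing here is taken from TiDB's code.

```python
# Rough estimate: what sustained scan rate would a single analyze request
# need in order to cover a whole Region before the RPC timeout fires?

READ_TIMEOUT_MEDIUM_S = 60   # tikv.ReadTimeoutMedium, as quoted in the post
REGION_SIZE_GIB = 10         # the "very large" Region from the post


def required_throughput_mib_s(region_gib: float, timeout_s: float) -> float:
    """Minimum sustained scan rate in MiB/s to finish within the timeout."""
    return region_gib * 1024 / timeout_s


if __name__ == "__main__":
    rate = required_throughput_mib_s(REGION_SIZE_GIB, READ_TIMEOUT_MEDIUM_S)
    print(f"{rate:.1f} MiB/s needed")  # prints "170.7 MiB/s needed"
```

Whether a TiKV node can sustain that rate for an analyze request depends on hardware and load, so this only shows why a 10 GB Region is uncomfortable for a 60s budget.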


translator_bot · June 23, 2024, 8:38am

| username: 啦啦啦啦啦

Regions split automatically. Once the data in a Region exceeds the default Region size limit, the Region is split into 2 Regions.
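
As a toy illustration of that behavior (not TiKV's actual split logic), the sketch below keeps halving any Region that is over a size limit until every piece fits. The 144 MiB limit and the function name are assumptions chosen for the example; real split points are chosen by TiKV and need not be exact halves.

```python
from typing import List


def split_until_within_limit(size_mib: float, limit_mib: float) -> List[float]:
    """Toy model: halve any Region above the limit until all pieces fit."""
    pending = [size_mib]
    done: List[float] = []
    while pending:
        region = pending.pop()
        if region > limit_mib:
            # An oversized Region is split into 2 Regions.
            pending.extend([region / 2, region / 2])
        else:
            done.append(region)
    return done


if __name__ == "__main__":
    # Assumed numbers: a 10 GiB Region and a 144 MiB size limit.
    regions = split_until_within_limit(10 * 1024, 144)
    print(len(regions), "Regions, largest", max(regions), "MiB")
```

Under this model a 10 GiB Region would not survive for long: it ends up as many small Regions, each of which is easy to analyze within the timeout.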

translator_bot · June 23, 2024, 8:38am

| username: TiDBer_D7483dYr

Okay, so this is probably not a major issue for now. However, I see that support for dynamic size regions is on the way.

github.com

tikv/rfcs/blob/master/text/0082-dynamic-size-region.md

# Dynamic size region

- RFC PR: https://github.com/tikv/rfcs/pull/0082
- Tracking Issue: https://github.com/tikv/tikv/issues/11515

## Summary

Make the size of a region dynamic, and only set an upper limit.

This is the first step that we try to support PiB scale cluster.

## Motivation

We have observed a lot of regressions when the count of regions increases. The major drawback
comes from three aspects:

1. Transactions need to access more regions, which leads to too many RPCs and high latency;
2. A node needs to drive more regions, which can cause high resource usage and hurt performance;
3. Tools and services like CDC/BR/GC need to loop over all regions, which is slow.


Once that lands, analyze timeouts on large regions will need to be taken into account, right?

translator_bot · June 23, 2024, 8:38am

| username: banana_jian

It looks like this is already planned.

translator_bot · June 23, 2024, 8:38am

| username: Lily2025

The dynamic size region design should already take this issue into account. On reads, a large Region is subdivided according to region-bucket-size (default 96M), so a request does not have to cover the whole Region at once.
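
A small sketch of why bucket-sized reads help with the timeout. It assumes each bucket is read as its own request with its own time budget, which is a simplification for illustration; the 50 MiB/s scan rate and the function name are invented for the example, and only the 96M bucket size comes from this reply.

```python
import math
from typing import Tuple


def bucket_scan_times(region_mib: float, bucket_mib: float,
                      scan_mib_s: float) -> Tuple[int, float, float]:
    """Return (bucket count, seconds to scan the whole Region in one request,
    seconds to scan the largest single bucket)."""
    buckets = math.ceil(region_mib / bucket_mib)
    whole_region_s = region_mib / scan_mib_s
    per_bucket_s = min(region_mib, bucket_mib) / scan_mib_s
    return buckets, whole_region_s, per_bucket_s


if __name__ == "__main__":
    # Assumed: 10 GiB Region, 96 MiB buckets, 50 MiB/s scan rate.
    n, whole, per_bucket = bucket_scan_times(10 * 1024, 96, 50)
    print(f"{n} buckets, {whole:.1f}s as one request, {per_bucket:.2f}s per bucket")
```

With these assumed numbers, a single request over the whole Region would overrun a 60s timeout, while each bucket-sized request stays far below it.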
