The analyze operation on partitioned tables is very slow, can it be optimized?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 分区的表的analyze非常慢,能不能优化下

| username: zhanggame1

If partition tables are used, TiDB will treat each partition as an independent table for analysis. The problem is that the analysis of each partition is particularly slow, as shown in the figure.

The above example has a total of 5 steps:

  1. auto analyze table all columns with 256 buckets, 500 topn, 1 samplerate
  2. merge global stats for z_qianyi.win_ticket’s index ind_win_time
  3. merge global stats for z_qianyi.win_ticket’s index ind_paid_time
  4. merge global stats for z_qianyi.win_ticket’s index primary
  5. merge global stats for z_qianyi.win_ticket columns

The last step is extremely slow, even though this partition only has 40,000 rows of data. See if it can be optimized.

| username: tidb菜鸟一只 | Original post link

To be honest, TiDB’s support for partitioned tables is not particularly good. Try running an analyze on a single partition and see if it’s slow.

| username: zhanggame1 | Original post link

Extremely, extremely slow, just like automatic analysis. With so many partitions, this analysis is not as good as directly analyzing the entire table.

For example, manually testing ANALYZE table z_qianyi.win_ticket PARTITION P359025;
It took 760 seconds to execute.

This partition has a total of 160,000 rows of data, and a full table scan only takes a few seconds.
The win_ticket table has a total of 690 partitions.

| username: heiwandou | Original post link

Please share the screenshots of the start time and end time.

| username: zhanggame1 | Original post link

There are also failures.

| username: 随便改个用户名 | Original post link

In our actual use, we also feel that ANALYZE is relatively slow.

| username: xingzhenxiang | Original post link

You can try adjusting the concurrency parameters, this might speed things up.

| username: WinterLiu | Original post link

I have a question, TiDB is already distributed, is it still necessary to create partitioned tables?

| username: zhanggame1 | Original post link

TiDB partition table is used in just one place, to delete a large amount of data.

| username: zhanggame1 | Original post link

I will try adjusting the partition table parameter tidb_auto_analyze_partition_batch_size from 1 to 100.

| username: 路在何chu | Original post link

Partitioned tables make it convenient to delete data, and this feature is for that purpose.

| username: zhang_2023 | Original post link

It’s also very slow on my end, not sure what’s going on.

| username: zhanggame1 | Original post link

I just tested increasing tidb_auto_analyze_partition_batch_size and it should have a significant effect. I directly changed it from 1 to 100.

According to the analysis, the partition analyze for partitioned tables is very fast when gathering statistics for partitions. For example, with our small data volume, it only takes a few seconds. The slow part is the step “merge global stats for test_vegas2.ticket columns.” Multiple partitions can share this step, so increasing the batch size is very effective, provided that the partition data volume is small.

| username: 有猫万事足 | Original post link

A series of optimizations have been enabled after version 7.4.

Merge global stats 7.6 can be executed asynchronously.
Previously, all partitions had to be fully counted before merging, but now counting and merging can be done simultaneously.

| username: zhanggame1 | Original post link

The step of merging global stats is not only slow but usually results in failure.

| username: TIDB-Learner | Original post link

TiDB’s support for partitioning is not very good.

| username: 哈喽沃德 | Original post link

Try using a single partition. If it’s still slow, then there’s nothing more that can be done.

| username: zhanggame1 | Original post link

I see there is a parameter called tidb_merge_partition_stats_concurrency, which can increase the merge concurrency.

| username: zhanggame1 | Original post link

There is still a problem now, which is that the merge step always reports an error.

| username: zhanggame1 | Original post link

Finally, here are the research results:

The slowness is due to bugs in version 7.5.0 and others. During the merge global stats step, it is not only slow but also likely to fail with the error “illegal bucket order.”
This error is mentioned as fixed in the 7.6.0 release notes: * Fixed an issue where merging global statistics for partitioned tables concurrently might encounter errors #48713 @hawkingrei

After testing with the 7.6.0 version, the merge global stats in version 7.6.0 does not report errors and is more than ten times faster.