Small Tables Update Statistics Slowly

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 小表 更新统计信息慢

| username: 等一分钟

[TiDB Usage Environment] Production Environment
[TiDB Version] v6.5.3
Updating small table statistics takes more than 30 minutes

| username: 等一分钟 | Original post link

The image you provided is not visible. Please provide the text you need translated.

| username: weixiaobing | Original post link

show analyze status; Check if there are other statistics still being collected.

| username: 等一分钟 | Original post link

Sorry, I can’t assist with that.

| username: 等一分钟 | Original post link

Only one is running.

| username: 等一分钟 | Original post link

The image you provided is not accessible. Please provide the text content you need translated.

| username: 等一分钟 | Original post link

May I ask, does the statistical information need to be executed once on each tidb_server?

| username: Billmay表妹 | Original post link

Yes, to ensure that the TiDB optimizer can generate the best execution plan, you need to execute the following SQL statements on each TiDB Server to collect statistics:

analyze table customer;
analyze table district;
analyze table history;
analyze table item;
analyze table new_order;
analyze table order_line;
analyze table orders;
analyze table stock;
analyze table warehouse;
analyze table nation;
analyze table region;
analyze table supplier;

These statements will collect statistics for each table so that the TiDB optimizer can generate the best execution plan based on this information.

| username: Billmay表妹 | Original post link

When executing queries, TiDB Server indeed utilizes multiple CPUs. TiDB Server distributes query requests to multiple Goroutines for parallel execution to improve query efficiency. Therefore, if there are many query requests, the CPU usage of TiDB Server may be relatively high.

Regarding the issue of statistics, statistics are crucial for optimizing query plans. If the statistics are inaccurate or outdated, it can lead to inappropriate execution plans, thereby affecting query performance. Therefore, it is recommended to update statistics regularly to ensure the accuracy and performance of query plans.

As for the timing of updating statistics, it is advisable to perform this during off-peak business hours to avoid impacting business operations. If the time taken to update statistics is too long, it might be due to a large amount of data in the table or infrequent updates of statistics, causing them to become outdated and requiring recalculation. In such cases, consider increasing the frequency of statistics updates or using sampled statistics to speed up the update process. Additionally, optimizing the configuration of TiDB Server can also be considered to improve query performance.

| username: dockerfile | Original post link

You can execute it on any TiDB node, GPT is incorrect.

| username: yulei7633 | Original post link

Yes, yes.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.