How is the database collecting and processing statistical information?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 请问下数据库收集统计信息是怎么处理的

| username: TiDBer_Y2d2kiJh

【TiDB Usage Environment】Production Environment / Testing / PoC
【TiDB Version】v5.4.0
【Reproduction Path】The collection of database statistics is random. Does collecting statistics affect business operations? Can the database collect statistics at a specific time?
【Encountered Issues: Problem Phenomenon and Impact】
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】

| username: MrSylar | Original post link

“The database collects statistical information randomly” is incorrect. It is not random; there are parameters that control the collection time window and trigger threshold.
“Does the database affect business operations when collecting statistical information?” It should be said that statistical information affects the execution plan, the execution plan affects SQL execution efficiency, and thus affects business operations.
“Can the database collect statistical information at a specific time?” Yes, it can.
For more details, you can refer to the official documentation: Introduction to Statistics | PingCAP Documentation Center

| username: tidb菜鸟一只 | Original post link

You can adjust these parameters to modify the time and threshold for automatically collecting statistics.

| username: YuchongXU | Original post link

You can set the collection window, which may cause the execution plan to change.

| username: cassblanca | Original post link

You can trigger the job of collecting statistics based on the degree (proportion) of data modification in the table, or you can collect full or incremental statistics on a daily schedule, which can be handled very flexibly. Collecting statistics is a computationally intensive operation, and it is generally recommended to perform statistics collection during system idle windows to avoid collecting during busy periods.

| username: redgame | Original post link

Affects business.

| username: TiDBer_Y2d2kiJh | Original post link

Why is TiDB still collecting statistics even though I set tidb_auto_analyze_end_time to 5 AM yesterday?

| username: MrSylar | Original post link

The difference in time zones, the set value is UTC time +0000, and the dashboard displays +0800.

| username: tidb菜鸟一只 | Original post link

It’s probably a time zone issue.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.