Check table data distribution: Use the EXPLAIN command to view the execution plan behind ANALYZE TABLE and check if the data distribution is even.
Adjust parallelism: If the parallelism is set too high, try lowering it to reduce memory consumption.
Check TiDB configuration: Review the settings of parameters such as tidb_analyze_auto_threshold and tidb_mem_quota_query, and adjust them according to the actual situation.
Monitor resource usage: While executing ANALYZE TABLE, monitor the system’s resource usage, including CPU, memory, and disk I/O, to help identify bottlenecks.
Use TiDB’s monitoring tools to track resource usage during the ANALYZE TABLE operation to determine if there are excessive concurrent operations or configuration issues.
Check TiDB configuration: Review the settings of parameters like tidb_analyze_auto_threshold and tidb_mem_quota_query, and adjust them according to the actual situation.
There is also a parameter tidb_mem_quota_analyze=-1 by default. If tidb_mem_quota_query=3G at this time, is the memory controlled by tidb_mem_quota_query when executing ANALYZE TABLE?
Is it determined by the smaller value of these two parameters whether ANALYZE will be terminated?
Currently, TiDB supports two types of statistics: tidb_analyze_version = 1 and tidb_analyze_version = 2. In versions v5.3.0 and later, the default value of this variable is 2. If you upgrade a cluster from a version prior to v5.3.0 to v5.3.0 or later, the default value of tidb_analyze_version does not change.
Version 2 statistics avoid the significant errors that may occur in large datasets due to hash collisions in Version 1, while maintaining estimation accuracy in most scenarios.
Adjust memory limit: In TiDB’s configuration file, adjust the performance.max-memory option to limit the maximum amount of memory that a single query can use.
The good strategy is probably not out yet.
One is very fast, definitely requires a lot of resources.
One can be slower, requiring relatively fewer resources.
One is balanced, with resource requirements balanced according to the actual situation.
You can limit the time for automatic analyze to off-peak hours to prevent auto analyze from affecting business operations. At the same time, you can also reduce the amount of memory that can be used. The parameters related to analyze performance are:
In the 6.5 series, if the configuration is not reasonable, there may be a risk of high memory usage during analyze. Additionally, the characteristics of the table itself can affect the memory usage of analyze, especially for large wide tables and partitioned tables. In this regard, optimizations have been made in version 7.5 and above, which you can try using. Additionally, tidb_enable_historical_stats might cause OOM; this has been disabled by default in the latest version, so you might consider turning it off directly.
Provide the table structure.
Check if the default configuration values for analyze have been changed.