Issues with Inaccurate TiDB SQL Execution Plans

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIDB SQL执行计划不准确问题

| username: residentevil

[TiDB Usage Environment] Production Environment
[TiDB Version] V7.1.0
[Encountered Problem: Problem Phenomenon and Impact] After just loading data [online data] through DTS, we wanted to start a performance stress test and found that the SQL execution plan was inaccurate, leading to SQL execution times not meeting expectations. After executing analyze table, the execution plan became accurate. What is the situation here? Isn’t TiDB’s statistics collection automatic? I see there are configurations for the analyze rate in TiDB.

| username: tidb菜鸟一只 | Original post link

What are the settings for tidb_auto_analyze_ratio?
What about tidb_auto_analyze_start_time and tidb_auto_analyze_end_time?

| username: residentevil | Original post link

I see that these are all default configurations and haven’t been changed. I understand that the behavior of statistics collection should be automatic, right? Does TiDB have a view to query the last analysis time of a table?

| username: residentevil | Original post link

The sampling rate difference for each table seems quite large, probably less than 50%. As shown above, this is the output of my executed analyze table operation.

| username: h5n1 | Original post link

I see that all the configurations are default and haven’t been changed. I understand that the behavior of collecting statistics should be automatic, right? Does TiDB have a view to query the last analysis time of a table?

show analyze status

| username: residentevil | Original post link

This command represents the operation of manually triggering the analyze table, right? Is there a system view to check for automatically triggered ones?

| username: xfworld | Original post link

If you’re in a hurry, handle it manually.

There is a significant difference in processing efficiency between automatic collection and manual operation.

| username: residentevil | Original post link

It would be much more serious if the issue of inaccurate execution plans appeared online. :sweat_smile:

| username: h5n1 | Original post link

You can only search for keywords in the tidb.log if you can’t see this.

| username: tidb菜鸟一只 | Original post link

Automatically triggered ones are also included in this.

| username: residentevil | Original post link

I’m observing.

| username: xfworld | Original post link

Totally understand :slightly_smiling_face: :slightly_smiling_face: :slightly_smiling_face:

| username: residentevil | Original post link

How can I save only the most recent 30 records? :sweat_smile:

| username: xfworld | Original post link

Is it a slow query? Or something else?

| username: residentevil | Original post link

The output result of show analyze status, I see that the analyze_status view in the information_schema database also outputs a maximum of 30 records :rofl:

| username: h5n1 | Original post link

Try adding “all” at the end.

| username: residentevil | Original post link

There is no such command, I have tested it.

| username: Z六月星星 | Original post link

If TiDB has executed a large number of inserts or updates and the execution plan is incorrect, try running ANALYZE to update the index statistics.

| username: 有猫万事足 | Original post link

You can check the health of the table’s statistics and roughly estimate the accuracy of the statistics on the table using SHOW STATS_HEALTHY. When modify_count >= row_count, the health is 0; when modify_count < row_count, the health is (1 - modify_count / row_count) * 100.

| username: residentevil | Original post link

A health score of 0 means that the statistics may not be accurate, right?