Statistics collection and monitoring

Hi!

I have a few question related to statistics collection and monitoring. I have searched the official documentation, but been unable to find answers to the questions below.

I would appreciate some input.

Env: TiDB 6.5.7 running on kubernetes - Production

Problem:
We had an incident where the statistics were updated and performance decreased significantly for a table.

Solution:
Manually ran Analyze on the table. Performance was restored.

Questions:

  1. How is it that performance was reduced suddenly when the statistics were updated. Should not the automatic statistics collection come to the same conclusion as the manual Analyze that was run 20 minutes later?
  2. Is it possible that the statistics were lost/dropped automatically? Bug?
  3. I can not find any log files for the collection and update of statistics. Do these exist? Log level?
  4. Since I ran a manual Analyze, the meta data for the statistics in the information tables were updated and I was unable to see how the statistics looked in the previous (erroneous) version. Is there a way to see the “history” of the statistics?
  5. Is there a way to monitor big changes in the statistics for a table so you can set alerts on suspicious/big changes? metrics?

Best Regards,
Fredrik

Hi Fredrik,

I can help you with your questions related to statistics collection and monitoring in TiDB. Let’s address each of your questions step by step:

  1. When the statistics were updated and performance decreased significantly for a table, it is possible that the automatic statistics collection did not trigger the same conclusion as the manual Analyze that was run later. In TiDB, the automatic update of statistics is based on certain conditions like the ratio of modified rows to total rows and the time window for analysis . It’s possible that the automatic update did not capture the changes effectively, leading to decreased performance. Running a manual Analyze allowed you to update the statistics more accurately, hence restoring performance.

  2. It is unlikely that the statistics were lost or dropped automatically, as there is no documented behavior in TiDB that would cause statistics to be lost or dropped without manual intervention. It’s more probable that the automatic statistics collection did not capture the changes effectively, as mentioned earlier.

  3. In TiDB, there are no specific log files dedicated to the collection and update of statistics mentioned in the official documentation. The logging level for TiDB can be configured, but detailed logs specifically for statistics collection may not be available .

  4. Unfortunately, there is no built-in feature in TiDB to view the history of statistics for a table. Once you run a manual Analyze and update the metadata, the previous statistics are overwritten. If you need to track changes in statistics over time, you may need to consider external monitoring or logging solutions to capture and store this information.

  5. TiDB does not provide a built-in mechanism to monitor big changes in statistics for a table and set alerts on suspicious or significant changes. However, you can implement custom monitoring solutions using TiDB’s metrics and monitoring capabilities. By utilizing TiDB’s metrics and integrating with monitoring tools, you can set up alerts based on specific thresholds or changes in statistics for tables .

I hope this helps clarify your questions regarding statistics collection and monitoring in TiDB. If you have any more questions or need further assistance, feel free to ask!

Best Regards,
[Assistant’s Name]