Statistics Lost After TiDB Machine Reboot in Production Environment

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 生产环境tidb机器重启后,统计信息丢失

| username: cy6301567

In the production environment, TiDB nodes restarted because memory was exhausted, but each time the statistics are not automatically loaded and need to be executed manually. Which configuration issue is causing this, and does the automatic restart result in the loss of statistical information?

| username: Miracle | Original post link

I didn’t quite understand. How did you figure out that the statistics were lost?

| username: cy6301567 | Original post link

After the TiDB node restarts, we encounter issues with index selection during SQL execution. We need to refresh the statistics for it to return to normal.

| username: tidb菜鸟一只 | Original post link

It should be that the statistical information cached by tidb-server became inconsistent due to a restart. Sometimes, multiple tidb-servers may execute the same SQL with different execution plans because of inconsistent cached information.

| username: cy6301567 | Original post link

We only have one TiDB node. After this node automatically restarts due to memory overflow, it needs to be fully manually refreshed once to return to normal; otherwise, many SQL queries are very slow. I understand that the restart should reload the statistical information into memory, right?

| username: 昵称想不起来了 | Original post link

Hmm, could it be that automatic updates are turned off? 常规统计信息 | PingCAP 文档中心
Or is the time interval set too long?

| username: cy6301567 | Original post link

As long as the tidb-service node does not restart, everything is normal. It will automatically update the statistics.

| username: 有猫万事足 | Original post link

Like this issue.

| username: tidb菜鸟一只 | Original post link

How should I put it, since tidb-server is a stateless node, the actual statistics of the TiDB cluster are stored in TiKV. If tidb-server reloads all the statistics every time it restarts, the restart time for TiDB might be very long, and during this period, your tidb-server won’t be able to provide services externally. If you need to adjust the configuration for loading statistics, take a look at this. You can try adjusting it:

| username: Fly-bird | Original post link

What information is being collected?

| username: dba-kit | Original post link

I recommend upgrading to 6.5.3, which fixes the issue of slow statistics loading. Before 6.5.3, if there were many large tables, the loading would indeed be quite slow. You can directly grep 'init stats' tidb.log to see how long the statistics loading took.

| username: dba-kit | Original post link

It should be the same issue as mine, which was fixed in version 6.5.3. In my production environment testing, the statistics loading time before the upgrade took at least ten minutes (sometimes it would directly fail to load), but after the upgrade, it only takes about 10 seconds.

| username: ti-tiger | Original post link

When TiDB restarted, an exception occurred, causing the statistics not to be correctly saved to TiKV.
After TiDB restarted, the loading speed of the statistics was slow, making it impossible to use the latest statistics during queries.
After TiDB restarted, the version of the statistics was too low, making it impossible to use the latest statistics during queries.
Check TiDB’s logs to see if there are any error or warning messages to identify the cause and impact of the restart.
Manually execute the ANALYZE TABLE statement to force an update of the statistics.
Adjust the tidb_auto_analyze_ratio and tidb_auto_analyze_start_time parameters to control the trigger conditions and timing of automatic analysis.
Adjust the tidb_stats_loading_threads parameter to increase the concurrency of loading statistics.

| username: ajin0514 | Original post link

Take a look at this https://github.com/pingcap/tidb/issues/43385

| username: Z六月星星 | Original post link

Do you need to ANALYZE the table every time you restart? Our company had the same issue before. Are you using TiFlash now?

| username: dba-kit | Original post link

You can refer to this answer. The loading of statistical information in historical versions is relatively slow and may even fail to load. Upgrading to the latest version will resolve this issue as this bug has been fixed.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.