Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 多个TiDB之间的统计信息(Statistic)是如何同步的? (How are statistics synchronized between multiple TiDB servers?)
[TiDB Version] Latest version or older versions
[Encountered Problem: Phenomenon and Impact]
I have read multiple TiDB-related documents and blog posts, but none of them explain how statistics are propagated or synchronized between multiple TiDB servers. Could you please clarify:
- If I deploy multiple TiDB servers, are the statistics of each server synchronized or exchanged with the others?
- If the statistics are synchronized or propagated, how is this implemented?
- Can the statistics differ between TiDB servers? If so, does that cause any adverse effects?
The table’s statistics are stored in TiKV.
- If there are multiple TiDB servers, their statistics are read from TiKV, so there is no synchronization issue.
- No synchronization is needed because everything is persisted in TiKV.
- There might be deviations. First, each TiDB server loads statistics into memory at startup, so servers started at different times may hold different statistics. Second, even once statistics are cached in memory, a server sometimes has to load the latest statistics from TiKV, and that load can occasionally fail due to a timeout. As a result, not every TiDB server is guaranteed to hold the same statistics at a given moment, so the same SQL statement can get different execution plans on different servers, some of them suboptimal. In such cases, manual intervention might be necessary.
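To make the last point concrete, here is a toy model of two servers caching statistics from one shared store. It is only an illustrative sketch, not TiDB's implementation: `SharedStore`, `Server`, `reload`, and the `ok` flag are hypothetical names, and a failed load is simulated rather than caused by a real timeout.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStore:
    """Stands in for the statistics persisted in TiKV (hypothetical model)."""
    version: int = 0
    row_counts: dict = field(default_factory=dict)

    def analyze(self, table: str, row_count: int) -> None:
        # Each new collection of statistics bumps the stored version.
        self.row_counts[table] = row_count
        self.version += 1


@dataclass
class Server:
    """Stands in for one TiDB server with an in-memory statistics cache."""
    name: str
    cached_version: int = -1
    cache: dict = field(default_factory=dict)

    def reload(self, store: SharedStore, ok: bool = True) -> bool:
        # ok=False simulates a load that timed out: the old cache stays in use.
        if not ok:
            return False
        self.cache = dict(store.row_counts)
        self.cached_version = store.version
        return True

    def estimated_rows(self, table: str) -> int:
        # The planner only sees the cached copy; 0 if nothing was ever loaded.
        return self.cache.get(table, 0)


store = SharedStore()
store.analyze("orders", 1_000)

a, b = Server("tidb-a"), Server("tidb-b")
a.reload(store)
b.reload(store)

store.analyze("orders", 1_000_000)  # statistics change in the shared store
a.reload(store)                     # tidb-a picks up the new version
b.reload(store, ok=False)           # tidb-b's load "times out"

# The two servers now plan with different row estimates.
print(a.estimated_rows("orders"), b.estimated_rows("orders"))

b.reload(store)                     # a later successful load converges
print(a.estimated_rows("orders") == b.estimated_rows("orders"))
```

In this model the servers never talk to each other. They only diverge while one of them holds a stale cache, and they converge again after the next successful load from the shared store.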
Table-level statistics import and export can be performed.
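As a rough sketch of what that export and import could look like, the snippet below builds the URL of the statistics dump endpoint on a TiDB server's status port and the `LOAD STATS` statement used to import the resulting file. It assumes the endpoint path and statement described in the TiDB statistics documentation. The host, the port `10080`, the database and table names, and the file path are placeholders, and the helper function names are made up for this example.

```python
from urllib.parse import quote
from urllib.request import urlopen


def stats_dump_url(host: str, status_port: int, db: str, table: str) -> str:
    """Build the status-port URL that returns one table's statistics as JSON."""
    return f"http://{host}:{status_port}/stats/dump/{quote(db)}/{quote(table)}"


def export_stats(host: str, status_port: int, db: str, table: str, path: str) -> None:
    """Download the JSON statistics of one table and save them to `path`."""
    with urlopen(stats_dump_url(host, status_port, db, table), timeout=10) as resp:
        data = resp.read()
    with open(path, "wb") as f:
        f.write(data)


def load_stats_sql(path: str) -> str:
    """Return the SQL statement that imports a previously exported file."""
    escaped = path.replace("\\", "\\\\").replace("'", "\\'")
    return f"LOAD STATS '{escaped}'"


# Placeholder values; export_stats() is not called here because it needs
# a reachable TiDB server.
print(stats_dump_url("127.0.0.1", 10080, "test", "orders"))
print(load_stats_sql("/tmp/orders_stats.json"))
```

Because the statistics are persisted in TiKV, importing them through any one TiDB server should be enough. The other servers read the same persisted data on their next load.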
TiDB servers do not synchronize statistics with each other; each one reads them directly from TiKV.
Statistics will be persisted to TiKV, and TiDB nodes will synchronize from TiKV.
Learned a lot, thank you!
You can refer to the documentation: 常规统计信息 (Introduction to Statistics) | PingCAP Docs
Statistics will be persisted to TiKV, and TiDB nodes will read from TiKV.