Manual analyze table takes too long

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 手动analyze table 耗时过长

| username: 像风一样的男子

I have a large table here, and manually analyzing it took 16096.221 seconds. Is there any way to optimize this?

| username: RenlySir | Original post link

Which version of TiDB is it? How large is the table data? Is this table a partitioned table?

| username: 像风一样的男子 | Original post link

With tens of billions of data, not partitioning it might actually be slower if partitioned.

| username: Kongdom | Original post link

:thinking: Enable fast analysis?

| username: TiDBer_jYQINSnf | Original post link

I want to know if this analyze can be executed incrementally, or executed in the background like DDL, because it takes so long to complete. If the connection is lost due to network reasons, will it cause the previous execution to be wasted?

| username: Kongdom | Original post link

Normally, the version is marked here.

| username: tidb菜鸟一只 | Original post link

Refer to here: 统计信息简介 | PingCAP 文档中心

I think for particularly large tables, data should be collected proportionally. For example, WITH 0.1 SAMPLERATE only collects 10% of the data.

Additionally, I suggest modifying the concurrency of analyze. Increase it appropriately without affecting production applications.

| username: RenlySir | Original post link

  1. As mentioned above, adjusting parameters can speed up the analyze process, but it will have an impact on the business.
  2. Increasing the number of TiKV nodes can also appropriately speed it up.
  3. Upgrading to v6.5.8 will also improve the analyze speed.
    It is recommended to upgrade.
| username: zhaokede | Original post link

For large tables with frequent data additions and deletions, slow optimization is normal. If it were faster, it could impact performance.

| username: dba远航 | Original post link

When analyzing, adding a sampling ratio will certainly affect the optimizer’s choice of algorithm.

| username: TiDBer_5cwU0ltE | Original post link

You can increase the ANALYZE concurrency and allocate more resources to the analyze operation. Reference link:

| username: DBAER | Original post link

Check the documentation, there is a concurrency parameter.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.