Issues with ANALYZE TABLE

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: analyze table 的问题

| username: 大钢镚13146

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed that led to the issue
[Encountered Issue: Issue Phenomenon and Impact]
[Resource Configuration]
[Attachments: Screenshots / Logs / Monitoring]

  1. When the system automatically performs analyze table, will I see it clearly on the frontend?
  2. What impact does analyze table have? IO, memory, or CPU? Does it only affect queries?
  3. Today, I performed a bulk insert of data into a table, and this table has been analyzed dozens of times within a day, each time taking more than ten minutes. Is the abnormal CPU usage related to analyze? How should this scenario be optimized? Temporarily disable analyze?
| username: 裤衩儿飞上天 | Original post link

  1. You can execute show analyze status; to check.
  2. analyze table is actually a relatively heavy operation for large tables, and it will definitely have an impact on IO and CPU, as well as read and write operations.
  3. Performing analyze on large tables a dozen times a day will definitely have some impact on the CPU, but it may not be the only factor. You can specify the system’s automatic analyze time, and for large tables, you can create automatic tasks for manual analyze. Manual analyze is much faster than automatic, and it is best to perform analyze during off-peak business hours.
| username: Hacker007 | Original post link

Data updates will trigger the automatic execution of the analyze table, which can affect performance. It is recommended to adjust the execution time of the analyze table to be during off-peak hours.

| username: 随缘天空 | Original post link

The analyze table command can impact cluster performance. You can refer to the following suggestions and try them out. It is also recommended to use the analyze table command during off-peak business hours.

| username: dba-kit | Original post link

You can set the time period for automatic analyze.

| username: dba-kit | Original post link

These two parameters limit the number of concurrent threads for analyze. If you feel that analyze is using too much CPU, you can also reduce them.

| username: Kongdom | Original post link

It is recommended to change the automatic analyze time to be carried out during off-peak business hours.

| username: xingzhenxiang | Original post link

The speed of automatic maintenance is slower than manual, so I almost don’t use automatic maintenance anymore.

| username: Jellybean | Original post link

The execution of the analyze process may consume a considerable amount of memory, CPU, IO, and other resources, especially when dealing with large tables. If it is executed manually, it is strongly recommended to perform it during off-peak business hours to avoid interference and impact on normal usage.

| username: dba-kit | Original post link

There are parameters that can be adjusted, you can refer to the screenshot here.

| username: Kongdom | Original post link

No, the time spent on automatic maintenance and manual maintenance should be the same, right? :thinking: