Why does auto analyze table always fail, while manually executing analyze table succeeds?

translator_bot · June 21, 2024, 12:55pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 为何auto analyze table总是失败，而人工执行analyze table却能成功呢？

| username: DBRE

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.2.2

[Encountered Problem: Phenomenon and Impact]
Why does auto analyze table always fail, while manually executing analyze table succeeds? Is there any difference between the two analyze operations? This issue has occurred many times in this cluster.

Analyze-related entries from tidb.log:

| username: h5n1

Automatic analyze runs single-threaded, so on large tables it takes long enough to fail because of GC. When you run analyze manually, several build-related variables (for example, tidb_build_stats_concurrency) control the degree of concurrency.

| username: DBRE

However, the manual analyze also takes nearly 2 hours, which far exceeds the tidb_gc_life_time setting. Why does it still succeed?

| username: xingzhenxiang

The level of concurrency is different.

| username: h5n1

The logs you posted confirm that the auto analyze failed because it exceeded the GC safepoint. Earlier versions did not automatically hold back the GC safepoint for background tasks; this was improved in 6.x to reduce GC-related failures. For long-running foreground tasks, such as a manual analyze, the safepoint is held back automatically. To verify this, run a manual analyze that lasts longer than the GC life time and, while it is running, check the safepoint time with: pd-ctl service-gc-safepoint.
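
To make the explanation above concrete, here is a toy model in Python. It is only a sketch of the reasoning in this thread, not TiDB's actual GC code. It assumes a tidb_gc_life_time of 10 minutes and a GC check every 10 minutes (the thread does not state the real values), and the function names, the gc_aware flag, and the timestamps are made up for illustration. gc_aware stands in for "GC knows about this task and holds the safepoint back for it", which per the reply above is the case for a foreground analyze but not for the background auto analyze on older versions.

```python
from datetime import datetime, timedelta


def gc_safe_point(now, gc_life_time, held_start_times=()):
    """Toy model: the safepoint trails `now` by gc_life_time, but never
    advances past the start time of a task that GC is aware of."""
    candidate = now - gc_life_time
    return min([candidate, *held_start_times])


def analyze_survives(start, duration, gc_life_time, gc_aware,
                     step=timedelta(minutes=10)):
    """Step through the run at the assumed GC interval. The task fails as
    soon as the safepoint moves past its start time (its read snapshot)."""
    held = (start,) if gc_aware else ()
    now = start
    end = start + duration
    while now <= end:
        if gc_safe_point(now, gc_life_time, held) > start:
            return False  # snapshot fell behind the safepoint
        now += step
    return True


if __name__ == "__main__":
    start = datetime(2024, 6, 21, 12, 0)   # arbitrary example timestamp
    life_time = timedelta(minutes=10)      # assumed tidb_gc_life_time
    two_hours = timedelta(hours=2)

    # Foreground (manual) analyze: GC holds the safepoint back.
    print("manual:", analyze_survives(start, two_hours, life_time, gc_aware=True))
    # Background (auto) analyze on an older version: no hold-back.
    print("auto:  ", analyze_survives(start, two_hours, life_time, gc_aware=False))
```

In this model the 2-hour run only survives when the safepoint is held back for it, which matches the behavior described in this thread: the manual analyze can outlast tidb_gc_life_time, while the auto analyze on 5.2.2 cannot.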