Is there a parameter for automatically cleaning up TiDB's historical monitoring data?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb的监控历史数据,是否有自动清除参数设置?

| username: yulei7633

As time goes by, the amount of data in these tables (cluster_slow_query and stats_histograms) will keep growing, which may affect system performance. Is there a parameter that retains only the most recent N days of data, similar to the log retention settings for TiDB, TiKV, PD, and TiFlash?
server_configs:
  tidb:
    log.file.max-days: 15
  tikv:
    log.file.max-days: 15
  pd:
    log.file.max-days: 15
  tiflash:
    raftstore-proxy.log.file.max-days: 15

| username: Jolyne

It seems that there is no such setting, but you can use TTL to periodically delete data.

| username: 啦啦啦啦啦

These two tables do not store monitoring information. The first, the slow query table, is read from the slow log file, so it can be cleaned up with the log retention parameters. The second holds statistics, such as the number of distinct values in a column.

| username: 像风一样的男子

The table is populated from the local slow log files. You can check the log directories on each TiDB node and set up a scheduled task to clean them up.

| username: zhanggame1

Slow queries are stored in local slow query log files, which you can clean up manually.

| username: 小龙虾爱大龙虾

The slow log files that the information_schema.cluster_slow_query table maps to are part of the TiDB log files, so they are also governed by the log.file.max-days and log.file.max-backups parameters in the TiDB configuration file. stats_histograms, on the other hand, is part of the statistics. Its size depends on the number of tables and columns you have, so it will not grow indefinitely.

| username: 路在何chu

Write a script to regularly clean up slow logs.
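
A minimal sketch of such a script in Python. It assumes the rotated slow logs sit in the TiDB log directory and match tidb_slow_query*.log, and that the file currently being written is tidb_slow_query.log. The function name, the file pattern, and the 15-day default are illustrative assumptions, not TiDB settings, so check the actual file names on your nodes before using it.

```python
import sys
import time
from pathlib import Path


def clean_old_slow_logs(log_dir, max_age_days=15,
                        pattern="tidb_slow_query*.log",
                        active_name="tidb_slow_query.log", now=None):
    """Delete rotated slow log files older than max_age_days.

    Returns the list of deleted paths. The pattern and active_name
    defaults are assumptions about how the slow log files are named.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    deleted = []
    for path in sorted(Path(log_dir).glob(pattern)):
        # Never touch the file TiDB is currently writing to.
        if path.name == active_name or not path.is_file():
            continue
        # Age is judged by the file's last modification time.
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python clean_slow_logs.py <tidb-log-directory>
    for removed in clean_old_slow_logs(sys.argv[1]):
        print(f"removed {removed}")
```

It could be run daily from cron or a similar scheduler on each TiDB node, passing that node's log directory as the argument. Because the slow query tables are parsed from these files, removing a file also removes its entries from the query results.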

| username: Kongdom

Both of these are system tables. The first is produced by parsing the slow query logs, so its contents are likely cleared along with the logs. The second holds statistics, which should not grow indefinitely.

The CLUSTER_SLOW_QUERY table provides information related to slow queries from all nodes in the cluster. Its content is derived from parsing TiDB slow query logs, and its usage is the same as the SLOW_QUERY table.

  • stats_histograms holds the histograms that are part of the statistics.

| username: yulei7633

Replying to everyone at once: after reading all the replies, I roughly understand what you mean. Thank you all.