TiDB Log Surge

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb日志猛增

| username: TiDBer_2lwED25q

[Issue Encountered]: In our test environment, a colleague deployed TiDB, 3 TiKV nodes, and Prometheus on a virtual machine with a 200G disk. Starting this Tuesday we began getting disk space warnings and found that the TiDB log files had grown to 70G. After manually deleting everything except the last three days of logs, about 40G remained. Even now, keeping only the last two days of logs, they still take up 30G. This never happened before; the log volume has increased sharply. In the TiDB component's log directory, a new log file is generated roughly every ten minutes, each about 300M. As of June 27, there are already nearly 100 log files.

| username: lemonade010 | Original post link

Look at the errors in the logs. Once you fix whatever is causing them, the alerts will stop.

| username: TiDBer_jYQINSnf | Original post link

Tail the log to see what is being written. In general, the most common log entries during normal operation are slow COPROCESSOR messages.
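
To find out which entries dominate, something like the following may help. It is only a sketch: it assumes the log lines follow TiDB's bracketed format (`[time] [LEVEL] [source] ["message"] ...`), and `top_messages` is a hypothetical helper name, not part of any TiDB tool.

```python
import re
from collections import Counter

# Assumed line layout: [time] [LEVEL] [file.go:line] ["message"] [key=value]...
LINE_RE = re.compile(
    r'^\[[^\]]+\] \[(\w+)\] \[[^\]]+\] \[("(?:[^"\\]|\\.)*"|[^\]]*)\]'
)

def top_messages(lines, n=10):
    """Count (level, message) pairs and return the n most frequent."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not look like log entries
        level, msg = m.group(1), m.group(2)
        if len(msg) >= 2 and msg.startswith('"') and msg.endswith('"'):
            msg = msg[1:-1]  # drop the surrounding quotes
        counts[(level, msg)] += 1
    return counts.most_common(n)

# Example use on the last 100000 lines of a log file (path is a placeholder):
#   from collections import deque
#   with open("tidb.log", errors="replace") as f:
#       print(top_messages(deque(f, maxlen=100000)))
```

If one or two messages account for most of the lines, posting those is usually enough for others to tell what is going on.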

| username: juecong | Original post link

Modify the log level

| username: 像风一样的男子 | Original post link

Why don’t you post which logs are flooding the screen?

| username: liusu_sky | Original post link

The log level was not set when the cluster was deployed, so there were a lot of info logs. I changed the level to error, but I still see info-level entries in the recent logs, which means the change did not take effect. I will check whether I made the change in the wrong place.
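
One way to confirm whether a level change took effect is to count the levels of entries written after the time of the change. This is a rough sketch that assumes timestamps look like `2024/06/27 10:00:00.123 +08:00`; `levels_since` is a hypothetical helper.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed prefix of each line: [2024/06/27 10:00:00.123 +08:00] [LEVEL]
PREFIX_RE = re.compile(r'^\[([^\]]+)\] \[(\w+)\]')
TIME_FMT = "%Y/%m/%d %H:%M:%S.%f %z"

def levels_since(lines, cutoff):
    """Count log levels for entries at or after `cutoff` (timezone-aware)."""
    counts = Counter()
    for line in lines:
        m = PREFIX_RE.match(line)
        if not m:
            continue
        try:
            ts = datetime.strptime(m.group(1), TIME_FMT)
        except ValueError:
            continue  # timestamp not in the assumed format
        if ts >= cutoff:
            counts[m.group(2)] += 1
    return counts
```

If INFO still shows up in the result for a cutoff after the change, the new level has not been applied to the running process.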

| username: Kongdom | Original post link

After modifying the configuration, you need to run the reload command for the change to take effect.
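
As a sketch only, and assuming the cluster is managed with TiUP (the thread does not say how it was deployed): the level would be set as `log.level` under `server_configs.tidb` via `tiup cluster edit-config`, and then applied with `tiup cluster reload`. The cluster name below is a placeholder, and `reload_tidb` is a hypothetical wrapper.

```python
import subprocess

def reload_tidb(cluster_name, role="tidb", dry_run=True):
    """Build (and optionally run) the TiUP reload command for one role."""
    # -R limits the reload to the given role, here the TiDB servers only.
    cmd = ["tiup", "cluster", "reload", cluster_name, "-R", role]
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises if tiup exits non-zero
    return cmd

# Dry run: only shows the command that would be executed.
print(" ".join(reload_tidb("my-test-cluster")))
```

A reload restarts the affected instances, so it is worth picking a time when a brief interruption is acceptable.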

| username: zhanggame1 | Original post link

Set the log level to error.

| username: zhh_912 | Original post link

Check the log level. If that doesn't help, write a script to delete the old logs as a stopgap.
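
A minimal cleanup script might look like this. It assumes rotated files match `tidb*.log` and that the active file is named `tidb.log`; both names are assumptions to adjust to the actual log directory. `prune_old_logs` is a hypothetical helper and defaults to a dry run.

```python
import time
from pathlib import Path

def prune_old_logs(log_dir, keep_days=3, pattern="tidb*.log",
                   active_name="tidb.log", dry_run=True, now=None):
    """Delete rotated log files last modified more than keep_days ago.

    Returns the names of the files that were (or would be) removed.
    """
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    removed = []
    for path in sorted(Path(log_dir).glob(pattern)):
        if path.name == active_name or not path.is_file():
            continue  # never touch the file currently being written
        if path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            removed.append(path.name)
    return removed
```

Run it with `dry_run=True` first to see what would be deleted. TiDB's own `log.file.max-days` and `log.file.max-backups` settings may make a script like this unnecessary once they are configured and reloaded.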

| username: lemonade010 | Original post link

Find a quiet window to reload the configuration.

| username: jiayou64 | Original post link

First, take a look at what the errors are.

| username: 健康的腰间盘 | Original post link

Please provide a screenshot of the logs.

| username: TiDBer_QKDdYGfz | Original post link

Increase the log level to error.

| username: TiDBer_7S8XqKfl-1158 | Original post link

You need to reload it for the changes to take effect.

| username: TiDBer_3Cusx9uk-0775 | Original post link

You need to reload, otherwise it won’t take effect.

| username: Kongdom | Original post link

:joy: If the hardware allows, I recommend not setting it to error, because that makes troubleshooting much harder. Some SQL statements that fail are logged at the warn level. I have seen SQL execution errors show up in monitoring but not in the logs; after looking into it, I found that such SQL-level errors are logged as warnings.

| username: TiDBer_7S8XqKfl-1158 | Original post link

Check the TiDB configuration file and set the log level to something appropriate (e.g., info or warn). Avoid the debug level.

| username: TiDBer_3Cusx9uk-0775 | Original post link

Adjust the log level, and try to avoid using ERROR.

| username: TiDBer_7S8XqKfl-1158 | Original post link

If the logs on the other nodes have not surged, check whether the application workload is only hitting this one TiKV node.

| username: TiDBer_3Cusx9uk-0775 | Original post link

Could it be a load balancing issue? Did everything end up on one host?