Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: The data.pump directory takes up a lot of space. Can it be cleaned up directly?
[TiDB Usage Environment] Production Environment
[TiDB Version] V3.0.16
[Encountered Issues: Problem Phenomenon and Impact]
Issue 1: The version is fairly old. Binlog is enabled, but no drainer is configured in the configuration file, and checking the drainer status also returns nothing. Pump, however, is configured on multiple nodes. Is there any point in configuring only pump?
Issue 2: I manually stopped the pump, and the current pump status is:
In the data directory:
Can this directory be deleted directly?
Question 1: With binlog enabled and only pump configured, pump still records TiDB's operation logs. Without a drainer, however, the incremental data is not synchronized to a downstream cluster.
Question 2: The directory takes up a lot of space, but cleaning it manually is not recommended, because deleting it may affect drainer startup. It is better to configure automatic cleanup in the configuration file: TiDB Configuration File Description | PingCAP Docs
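Before choosing between automatic cleanup and scaling in, it may help to measure how much space the pump data directory really uses. The following is only a minimal sketch: the default path `data.pump` and the helper names `dir_size_bytes` and `human_readable` are assumptions for illustration, not part of any TiDB tooling.

```python
import os
import sys


def dir_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file may have been removed while walking; ignore it.
                continue
    return total


def human_readable(num_bytes):
    """Format a byte count using binary units (KiB, MiB, ...)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


if __name__ == "__main__":
    # Hypothetical default path; pass the real pump data directory as an argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "data.pump"
    print(root, human_readable(dir_size_bytes(root)))
```

The script only reads file sizes and does not delete anything, so it is safe to run against the directory while deciding what to do.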
If you don’t use binlog, you can directly scale down the pump.
The logs were not synchronized before being cleared, so the connection was lost.
Currently, there is no drainer node.
If there are other backups such as BR, this data can be completely cleared.
Without the drainer component, the pump component doesn’t serve much purpose either. Scale in pump as your situation allows, and pay attention to the binlog-related configuration of the TiDB instances when you do.
Having only the pump component without the drainer component is useless.
Just stop the pump directly, since there is no drainer anyway.
I think you could back up the data first and then delete it.