Symptom: The drainer component in the cluster crashes frequently and is difficult to restart and recover.
Investigation: The error log contains the message “receive big size binlog” near the failure, with a reported binlog size of over 100 MB. I found a similar issue on the forum; see conclusion 3 of the article “What does the SQL synchronized from upstream to Kafka through drainer look like in Kafka” (qhd2004’s column, TiDB Community).
Other information: drainer and pump both run with the default configuration.
Consultation: What are the strategies or suggestions for handling such issues? Is there a unified solution to this problem, or a way to prevent the drainer from frequently exiting abnormally?
Judging from the code location of the warning, this is only a warning that a single binlog entry is too large, probably because of a large transaction. This message alone cannot pinpoint the cause of the crash.
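Since the warning by itself does not explain the crash, it can help to list when it appears and compare those times with the times drainer went down. Below is a minimal Python sketch for that. The marker text is the phrase quoted in this thread; the timestamp prefix it looks for (`[YYYY/MM/DD HH:MM:SS...`) is an assumption about your log layout, and the function name and sample lines are made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Message text as quoted in this thread.
BIG_BINLOG_MARKER = "receive big size binlog"

# Assumed log prefix, e.g. "[2022/07/01 12:00:00.123 +08:00] [WARN] ...".
TIMESTAMP_RE = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def find_big_binlog_warnings(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, line) pairs for every line containing the marker.

    The timestamp is an empty string when the line does not start with
    the assumed prefix.
    """
    hits = []
    for line in lines:
        if BIG_BINLOG_MARKER not in line:
            continue
        match = TIMESTAMP_RE.match(line)
        hits.append((match.group(1) if match else "", line.rstrip("\n")))
    return hits


# Hypothetical sample lines, not copied from a real drainer log.
sample_log = [
    '[2022/07/01 12:00:00.123 +08:00] [WARN] ["receive big size binlog"]\n',
    '[2022/07/01 12:00:05.000 +08:00] [INFO] ["unrelated message"]\n',
]

for timestamp, line in find_big_binlog_warnings(sample_log):
    print(timestamp or "(no timestamp)", "|", line)
```

To use it on a real log, pass an open file object instead of `sample_log`, then check whether the listed timestamps fall shortly before each drainer exit.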
Since the version is v6.1.0, I suggest you stop using TiDB Binlog and switch to TiCDC. TiDB Binlog is no longer officially maintained.
If you still want to investigate the root cause, it is recommended to collect and share Clinic diagnostic data, so that both the logs and the monitoring metrics can be reviewed.
Suggestions regarding TiDB Binlog: read through the official documentation once, then check the common FAQ. That should be sufficient. TiDB Binlog is quite old now.
TiDB Binlog has a bug in v5.4.0 and earlier versions: after a large transaction, the memory usage of the drainer process often grows. You can check whether this is happening in your case.
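One way to check for this memory growth without any monitoring stack is to sample the resident memory of the drainer process from `/proc` on Linux. The sketch below is only an illustration: the function names and the 1.5x growth threshold are arbitrary choices of mine, not anything defined by TiDB Binlog, and it relies on the Linux-specific `/proc/<pid>/status` file.

```python
import re
from typing import List, Optional

# "VmRSS:" is the resident set size line in /proc/<pid>/status on Linux.
VMRSS_RE = re.compile(r"^VmRSS:\s+(\d+)\s+kB", re.MULTILINE)


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Extract VmRSS in kB from the text of /proc/<pid>/status."""
    match = VMRSS_RE.search(status_text)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident memory of a process (Linux only)."""
    with open(f"/proc/{pid}/status", encoding="utf-8", errors="replace") as fh:
        return parse_vmrss_kb(fh.read())


def keeps_growing(samples_kb: List[int], min_growth_ratio: float = 1.5) -> bool:
    """Return True if memory never drops between samples and the last
    sample is at least min_growth_ratio times the first one.

    The default ratio of 1.5 is an arbitrary illustrative threshold.
    """
    if len(samples_kb) < 2:
        return False
    non_decreasing = all(b >= a for a, b in zip(samples_kb, samples_kb[1:]))
    return non_decreasing and samples_kb[-1] >= samples_kb[0] * min_growth_ratio
```

For example, call `read_rss_kb(pid)` with the drainer PID every minute around a large transaction, collect the values in a list, and pass the list to `keeps_growing`. If it returns `True` and the memory never comes back down afterwards, that matches the behavior described above.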