Issue with TiCDC Sink to Kafka Maximum Message Size

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiCDCsink到kafka最大消息体问题

| username: juecong

[TiDB Usage Environment] Production Environment
[TiDB Version] 5.7.25-TiDB-v6.1.0
[Encountered Issue: Phenomenon and Impact]
Using TiCDC to replicate incremental data into Kafka. The changefeed is created as follows:

```shell
tiup cdc cli changefeed create --pd=http://192.168.60.4:2379 \
  --sink-uri="kafka://192.168.60.207:9092/tidb-to-kafka?protocol=canal-json&kafka-version=3.2.0&partition-num=3&max-message-bytes=2097152&replication-factor=1&enable-tidb-extension=true" \
  --changefeed-id="tidb-to-kafka" --sort-engine="unified" --config /home/ticdc_changefeed.toml
```
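For context, the broker-side limits that interact with TiCDC's max-message-bytes look roughly like this (a minimal sketch of a server.properties excerpt; the values are illustrative, not taken from this cluster):

```properties
# server.properties (illustrative values)
# Largest record batch the broker accepts; TiCDC's max-message-bytes
# should not exceed this (broker default is about 1 MB).
message.max.bytes=2097152
# Should be >= message.max.bytes so follower replicas can fetch large batches.
replica.fetch.max.bytes=2097152
# Hard cap on any single network request (broker default is 100 MB).
socket.request.max.bytes=104857600
```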

max-message-bytes is set to 2 MB, yet Kafka frequently reports the error org.apache.kafka.common.network.InvalidReceiveException: Invalid receive (size = 1195725856 larger than 67108868). With the limit at 2 MB, each batch sent to Kafka should be at most 2 MB, so where does such a huge request size come from? This Kafka cluster is used for nothing else, only CDC.
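One way to make sense of the number in that error (an illustrative check; the interpretation below is an inference, not something confirmed in the thread): a Kafka broker reads the first 4 bytes of every incoming request as a length prefix, and the 67108868 limit it is compared against is the broker's socket.request.max.bytes, not message.max.bytes. Decoding 1195725856 as 4 ASCII bytes:

```shell
# 1195725856 in hex is 0x47455420 ...
printf '%x\n' 1195725856        # -> 47455420
# ... which is the ASCII bytes "GET ", the start of an HTTP request.
printf '47455420' | xxd -r -p   # -> GET
```

A "size" that decodes to GET usually means a plain HTTP client connected to the broker's Kafka port, rather than TiCDC actually producing a ~1.2 GB message.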

| username: xfworld | Original post link

Refer to the TiCDC FAQ in the PingCAP documentation center (TiCDC 常见问题解答 | PingCAP 文档中心).

| username: juecong | Original post link

Okay, thank you. Could this be caused by a large transaction, with the 2 MB limit then failing to take effect?

| username: juecong | Original post link

Raising the Kafka parameters is one solution, but the Kafka broker has only 6 GB of memory. If a single message takes 1-2 GB, just a few such messages would make Kafka run out of memory (OOM). Also, could this be caused by a large transaction, making the 2 MB limit ineffective? Thanks for the reply.
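For reference, if raising broker parameters is the route taken, message.max.bytes can be changed per broker without a restart through Kafka's dynamic configuration (a sketch; broker id 0, the bootstrap address, and the 4 MB value are assumptions):

```shell
# Raise the broker's record-batch limit dynamically (Kafka >= 1.1;
# broker id 0 is an assumption).
bin/kafka-configs.sh --bootstrap-server 192.168.60.207:9092 \
  --entity-type brokers --entity-name 0 \
  --alter --add-config message.max.bytes=4194304

# Verify the resulting broker configuration.
bin/kafka-configs.sh --bootstrap-server 192.168.60.207:9092 \
  --entity-type brokers --entity-name 0 --describe
```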

| username: xfworld | Original post link

Refer to this passage

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.