Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: ticdc同步到kafka配置问题
TiDB version 6.1.7
Issue: the downstream Kafka limits the network packet size to 4 MB, but with the following TiCDC configuration (one changefeed per single table) it still triggers the error [Message was too large, server rejected it to avoid allocation error.]
"max-batch-size": "4" (the official documentation says this parameter does not take effect; the default is 16, but even at 16 it should not cause this error)
"max-message-bytes": "262144"
"protocol": "maxwell"
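For reference, these parameters are passed roughly along the following lines, assuming they are given as Kafka sink URI query parameters when the changefeed is created (the PD address, broker address, topic, and changefeed ID below are placeholders):

# Hypothetical changefeed creation; protocol, max-message-bytes and max-batch-size go in the sink URI
cdc cli changefeed create --pd=http://<pd-host>:2379 \
  --changefeed-id="single-table-task" \
  --sink-uri="kafka://<kafka-host>:9092/<topic>?protocol=maxwell&max-message-bytes=262144&max-batch-size=4"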
PS: Kafka's maximum message size is limited to 4 MB and cannot be modified, so only the TiCDC configuration can be adjusted. I ran into this issue on 4.0.13 and the official recommendation was to upgrade; after upgrading to 4.0.16 the issue remained, and now on 6.1.7 it still occurs, which is very frustrating.
You can change the configuration.
This configuration item can be modified. What about your Kafka configuration?
Add these parameters to config/server.properties:
# Maximum bytes a broker can receive for a message
message.max.bytes=200000000
# Maximum bytes a broker can replicate for a message
replica.fetch.max.bytes=204857600
# Maximum bytes a consumer can read for a message
fetch.message.max.bytes=204857600
What I mean is that Kafka is another team's system and is not managed by the DBA. The DBA is only a client; the server side is centrally managed by others and its configuration cannot be changed.
You can build your own Kafka.
Your downstream Kafka is refusing to accept the message.
You should have the downstream Kafka side modify its configuration and adjust the size limits.
Data is like water; if the downstream pipe becomes smaller, you either need to limit the CDC flow or increase the capacity of the downstream Kafka pipe.
The protocol in use is Maxwell, which batches multiple events into a single message. From the source code, max-batch-size does not take effect for this protocol. Maxwell is not an officially GA protocol of TiCDC.
From the provided configuration, max-message-bytes is 262144, i.e. 256 KB. If a batched message exceeds this value, the "message too large" error occurs.
Since your downstream Kafka limits messages to 4 MB, you can leave this parameter alone, or set it to 4 MB as well.
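As a rough sketch (sink URI parameters assumed; PD address, broker, and topic are placeholders), matching max-message-bytes to the 4 MB broker limit would look like:

# Hypothetical example: align TiCDC's client-side limit with the broker's 4 MB cap (4194304 bytes)
cdc cli changefeed create --pd=http://<pd-host>:2379 \
  --sink-uri="kafka://<kafka-host>:9092/<topic>?protocol=maxwell&max-message-bytes=4194304"

If batching overhead still pushes individual messages over the broker's limit, setting max-message-bytes slightly below 4 MB may be worth trying.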
I tried configuring only max-message-bytes=4MB, but it still reported an error, so now I don’t know how to configure it.
In our tests of the various protocols, Maxwell's readability was relatively good, so we finally settled on it. Which protocols are officially GA? Which one is recommended? And for a scenario where the downstream Kafka is limited to 4 MB, how should the parameters on the TiCDC side be configured under that protocol?
Sure, is there a version limitation for this? Does it apply to version 4.0.13 or 4.0.16?
We are currently in the testing phase, and if canal-json works in version 4.0.13, then we don’t need to upgrade for now.
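For reference, a canal-json changefeed against the same 4 MB-limited Kafka might be created roughly like this; whether canal-json and these sink URI parameters are supported on 4.0.13 is an assumption to verify against that version's documentation:

# Hypothetical sketch; verify canal-json availability on 4.0.13 before relying on it
cdc cli changefeed create --pd=http://<pd-host>:2379 \
  --sink-uri="kafka://<kafka-host>:9092/<topic>?protocol=canal-json&max-message-bytes=4194304"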