TiCDC Sends Messages to Kafka with Duplicates

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: ticdc发送消息到kfka 消息重复

| username: TiDBer_w0G4KWzg

[Test Environment for TiDB] How can duplicate messages be handled when TiCDC sends messages to Kafka?

| username: Billmay表妹

Please provide more information~

| username: Billmay表妹

If you see duplicate messages when TiCDC sends messages to Kafka, try the following checks:

  1. Check whether the sink-uri parameter of the changefeed is set correctly; an incorrect sink-uri may cause duplicate messages. [1]

  2. Check whether the max-message-bytes parameter of the changefeed is too small. A value that is too small may cause messages to be split into multiple parts, which can lead to duplicate messages; try increasing it. [1]

  3. Check whether the duplicates originate on the Kafka side. You can use the kafka-consumer-groups command to inspect the consumer group's offsets; if the offsets are wrong, the same command can reset them to the earliest or latest offset. Note that resetting to the earliest offset causes messages that were already consumed to be read again. [2]
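
The checks above can be sketched as a small shell script. This is only an illustration: the broker address, topic, consumer group, changefeed ID, TiCDC server address, protocol, and the max-message-bytes value are all assumed placeholders, not values from the original question. The script prints the commands by default rather than running them, and it assumes a TiCDC version whose cdc cli accepts --server (older versions use --pd instead).

```shell
#!/usr/bin/env bash
# Sketch only: every host, topic, group, and ID below is an assumed placeholder.
BROKER="127.0.0.1:9092"        # assumed Kafka broker address
TOPIC="ticdc-test"             # assumed topic name
GROUP="my-consumer-group"      # assumed consumer group name
MAX_MESSAGE_BYTES=1048576      # assumed value for illustration

# Build the sink URI in one place so its parameters are easy to review.
SINK_URI="kafka://${BROKER}/${TOPIC}?protocol=canal-json&max-message-bytes=${MAX_MESSAGE_BYTES}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Checks 1 and 2: create the changefeed with an explicit sink-uri and max-message-bytes.
run cdc cli changefeed create \
  --server="http://127.0.0.1:8300" \
  --changefeed-id="kafka-test" \
  --sink-uri="${SINK_URI}"

# Check 3: show current offset, log-end offset, and lag for the consumer group.
run kafka-consumer-groups.sh --bootstrap-server "${BROKER}" \
  --describe --group "${GROUP}"

# Preview an offset reset; replace --dry-run with --execute to apply it.
run kafka-consumer-groups.sh --bootstrap-server "${BROKER}" \
  --group "${GROUP}" --topic "${TOPIC}" \
  --reset-offsets --to-latest --dry-run
```

Running the script as is only prints the three commands, so the sink URI and offsets commands can be reviewed before anything touches the cluster.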

If these checks do not resolve the issue, please provide more details, such as the TiCDC and Kafka versions, their configurations, and any error messages, so that we can give more targeted suggestions.

| username: cassblanca

Follow the posting requirements.