Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: cdc 设置了ts 和 rowid 同时对多个表进行写操作时,同步到kafka中是一条数据还是多条数据
When CDC is configured with ts and rowid and multiple tables are written at the same time, is the data synchronized to Kafka as a single record or as multiple records?
The key difference between the modes lies in how rows are distributed across partitions, since a Kafka topic can have multiple partitions. With only one partition, that partition can easily become a bottleneck, which is why different distribution modes exist.
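As a rough illustration of what a distribution mode does, here is a minimal Python sketch that assigns row change events to partitions. The function names, the event fields (`table`, `row_id`, `commit_ts`), and the hashing choices are all assumptions made for illustration; this is not the actual CDC implementation.

```python
import zlib


def partition_by_ts(commit_ts: int, num_partitions: int) -> int:
    # Hypothetical "ts" mode: rows with the same commit timestamp
    # land in the same partition.
    return commit_ts % num_partitions


def partition_by_rowid(table: str, row_id: int, num_partitions: int) -> int:
    # Hypothetical "rowid" mode: every change to the same row
    # lands in the same partition.
    key = f"{table}:{row_id}".encode()
    return zlib.crc32(key) % num_partitions


def dispatch(rows, num_partitions, mode):
    """Group row events by target partition; one message per row event."""
    partitions = {p: [] for p in range(num_partitions)}
    for row in rows:
        if mode == "ts":
            p = partition_by_ts(row["commit_ts"], num_partitions)
        else:
            p = partition_by_rowid(row["table"], row["row_id"], num_partitions)
        partitions[p].append(row)
    return partitions
```

In this sketch, a transaction that writes one row to each of two tables still yields two row events. The mode only decides which partition each event goes to, not how many events there are.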
Regardless of the mode used, data is sent to Kafka row by row: each row written produces one message, so a write that touches multiple tables results in multiple records. The delivery guarantee is at-least-once.
Duplicates should only be sent when something unexpected happens…
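Because delivery is at-least-once, a consumer may want to tolerate the occasional duplicate. The following is a minimal sketch of consumer-side deduplication, assuming each message carries `table`, `row_id`, and `commit_ts` fields that together identify one row change; these field names are hypothetical and depend on the message format actually in use.

```python
def deduplicate(messages):
    """Drop messages whose (table, row_id, commit_ts) was already seen."""
    seen = set()
    unique = []
    for msg in messages:
        # Assumed identity of a row change; adjust to the real message schema.
        key = (msg["table"], msg["row_id"], msg["commit_ts"])
        if key in seen:
            continue  # duplicate delivery, skip it
        seen.add(key)
        unique.append(msg)
    return unique
```

A long-running consumer would need to bound the `seen` set, for example by discarding keys older than some timestamp. That detail is left out here.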
After setting multiple partitions, the performance issue is resolved, but ordering is only guaranteed within a single partition; globally, messages are unordered. How can we ensure global transaction order?