TiCDC replicating to Kafka is configured with 100 partitions, but only 3 partitions in Kafka have data

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: ticdc同步到kafka设置了partition100个,但是kafka只看到3个partition有数据

| username: TiDBer_Jack

I configured TiCDC to replicate to Kafka with 100 partitions, but only 3 partitions in Kafka contain data. Is this related to the number of TiCDC nodes? Any guidance would be appreciated.

| username: Billmay表妹 | Original post link

Can you provide more background information?

| username: Billmay表妹 | Original post link

If only 3 of the 100 partitions have data after TiCDC is set to sync to Kafka, the cause may be related to the number of TiCDC nodes. Each TiCDC instance monitors and syncs a portion of the data changes, and the instances write those changes to Kafka partitions. If only a few nodes in your TiCDC cluster are working, it may be that only some partitions receive data.

To ensure that all partitions receive data, you can consider the following points to optimize and adjust TiCDC’s configuration:

  1. Increase the number of TiCDC instances: By increasing the number of TiCDC instances, you can improve the concurrent processing capability of data changes, thereby more evenly distributing the data into Kafka partitions.
  2. Adjust TiCDC configuration: Check that the Kafka-related parameters of the replication task are correct, such as partition-num in the sink URI, which specifies the number of Kafka partitions TiCDC writes to. Set it to match the actual partition count of the topic so that data can be distributed across all partitions.
  3. Monitor TiCDC running status: Regularly monitor the running status of TiCDC, including the working status of each TiCDC instance and the data synchronization status, to promptly identify and resolve issues of uneven data synchronization.
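
To make the partition setting in point 2 concrete, here is a minimal Python sketch that assembles a Kafka sink URI carrying a partition-num value and checks it against the topic's real partition count. The broker address `127.0.0.1:9092`, the topic name `cdc-demo`, and both helper functions are hypothetical and only for illustration. The check reflects my understanding that the requested value should not exceed the number of partitions the topic actually has; please verify against the TiCDC documentation for your version.

```python
from urllib.parse import urlencode


def build_kafka_sink_uri(broker, topic, partition_num, protocol="canal-json"):
    """Assemble a Kafka sink URI string (hypothetical helper, not a TiCDC API)."""
    if partition_num < 1:
        raise ValueError("partition_num must be at least 1")
    # urlencode keeps the dict order, so partition-num comes first.
    query = urlencode({"partition-num": partition_num, "protocol": protocol})
    return f"kafka://{broker}/{topic}?{query}"


def partition_num_fits_topic(requested, actual_topic_partitions):
    """Return True when the requested count does not exceed what the topic has."""
    return 1 <= requested <= actual_topic_partitions


# Assumed example values: a local broker and a demo topic.
uri = build_kafka_sink_uri("127.0.0.1:9092", "cdc-demo", 100)
print(uri)

# A topic that was auto-created with 3 partitions cannot satisfy a request for 100.
print(partition_num_fits_topic(100, 3))
print(partition_num_fits_topic(100, 100))
```

The point of the second helper is that the number in the URI and the number of partitions on the Kafka side are separate things, so it is worth comparing them before starting the task.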

| username: xfworld | Original post link

How many nodes do you have?

| username: 小龙虾爱大龙虾 | Original post link

It should be fine, right? Check if there are any errors.

| username: TiDBer_Jack | Original post link

The issue has been resolved. The topic had been created automatically with 3 partitions when I first created the CDC task, and I only changed the partition setting after stopping the task. The fix was to delete the old topic and recreate it (I did this in a test environment, not in production).
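
For anyone who hits the same symptom, the following toy Python model shows why data can only ever appear in 3 partitions while the topic itself has 3 partitions, no matter how many nodes are writing. It is a sketch under assumptions: the CRC32-modulo rule and the `row-N` keys are made up for illustration and are not TiCDC's actual dispatch logic.

```python
import zlib


def assign_partition(key, topic_partitions):
    """Map a key to a partition index with a simple hash-modulo rule (toy model)."""
    return zlib.crc32(key.encode("utf-8")) % topic_partitions


def partitions_used(keys, topic_partitions):
    """Return the set of partition indexes that receive at least one key."""
    return {assign_partition(key, topic_partitions) for key in keys}


# Hypothetical row keys standing in for change events.
keys = [f"row-{i}" for i in range(10_000)]

used_old_topic = partitions_used(keys, 3)    # topic auto-created with 3 partitions
used_new_topic = partitions_used(keys, 100)  # topic recreated with 100 partitions

print(sorted(used_old_topic))
print(len(used_new_topic))
```

In this model the first result can never contain an index above 2, which matches what I saw before recreating the topic.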