Note:
This topic has been translated from a Chinese forum by GPT and might contain errors. Original topic: cdc怎么配置kafka的batch.size

How to configure the batch.size of Kafka in CDC
tidb-cdc: The Kafka topic name (the path segment of the sink URI)
kafka-version: Downstream Kafka version number (optional, default value is 2.4.0, currently the minimum supported version is 0.11.0.2)
kafka-client-id: Specifies the Kafka client ID for the synchronization task (optional, default value is TiCDC_sarama_producer_ followed by the ID of the synchronization task)
partition-num: Number of downstream Kafka partitions (optional, cannot be greater than the actual number of partitions. If omitted, the number of partitions is obtained automatically)
protocol: Indicates the message protocol output to Kafka, optional values are default, canal, avro, maxwell, canal-json (default value is default)
max-message-bytes: Maximum amount of data sent to the Kafka broker each time (optional, default value is 64MB)
replication-factor: Number of replicas for Kafka message storage (optional, default value is 1)
ca: Path to the CA certificate file required to connect to the downstream Kafka instance (optional)
cert: Path to the certificate file required to connect to the downstream Kafka instance (optional)
key: Path to the certificate key file required to connect to the downstream Kafka instance (optional)
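For illustration, here is a minimal Python sketch of how the parameters above could be combined into a sink URI. The helper name build_kafka_sink_uri, the broker address 127.0.0.1:9092, and the sample values (6 partitions, 1048576 bytes) are assumptions made up for this example, not values from the thread.

```python
from urllib.parse import urlencode


def build_kafka_sink_uri(broker, topic, params=None):
    """Assemble a kafka:// sink URI from a broker address, a topic, and query parameters.

    Hypothetical helper for illustration only; it is not part of TiCDC.
    """
    base = f"kafka://{broker}/{topic}"
    # Parameter names such as max-message-bytes are passed through unchanged.
    query = urlencode(params or {})
    return f"{base}?{query}" if query else base


# Sample values are assumptions chosen for the example.
params = {
    "kafka-version": "2.4.0",
    "partition-num": 6,
    "max-message-bytes": 1048576,
    "replication-factor": 1,
    "protocol": "canal-json",
}
uri = build_kafka_sink_uri("127.0.0.1:9092", "tidb-cdc", params)
print(uri)
```

The resulting string is what would be supplied as the sink URI (for example through --sink-uri) when creating the changefeed. Note that none of the listed parameters is named batch.size; max-message-bytes is the closest size-related setting in the list.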
Which parameters do you need?
You can refer to the documentation: TiCDC Changefeed CLI and Configuration Parameters | PingCAP Docs
After configuring CDC, I want messages to be sent to Kafka only once a batch reaches 1 MB in size.
How about trying this parameter for the URI?
# Kafka Producer Configuration Example
# Kafka Producer Configuration
bootstrap.servers=kafka-broker1:9092,kafka-broker2:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
# Set batch.size parameter
batch.size=1048576
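A note on what batch.size=1048576 does in the Java Kafka producer: it is an upper bound on a batch, not a threshold that must be reached before sending. A batch is also sent once linger.ms has elapsed, and linger.ms is not set in the properties above. The Python sketch below is a deliberately simplified model of that size-or-time rule, not the producer's actual implementation; the function name simulate_batches and the record sizes and timings are assumptions for illustration. Whether TiCDC exposes an equivalent setting is not confirmed in this thread.

```python
def simulate_batches(records, batch_size, linger_ms):
    """Simplified model of size-or-time batching (illustrative only).

    records: list of (arrival_ms, size_bytes), sorted by arrival time.
    Returns a list of (flush_ms, total_bytes, reason) tuples.
    """
    flushed = []
    opened_at = None  # arrival time of the first record in the open batch
    total = 0         # bytes accumulated in the open batch
    for arrival_ms, size in records:
        # Time-based flush: the open batch had already waited linger_ms.
        if opened_at is not None and arrival_ms - opened_at >= linger_ms:
            flushed.append((opened_at + linger_ms, total, "linger"))
            opened_at, total = None, 0
        # Size-based flush: this record would not fit into the open batch.
        if opened_at is not None and total + size > batch_size:
            flushed.append((arrival_ms, total, "full"))
            opened_at, total = None, 0
        if opened_at is None:
            opened_at = arrival_ms
        total += size
    # Whatever is left goes out when its linger time expires.
    if opened_at is not None:
        flushed.append((opened_at + linger_ms, total, "linger"))
    return flushed


# Assumed workload: three 400 kB records in quick succession, then a late small one.
records = [(0, 400_000), (10, 400_000), (20, 400_000), (500, 100_000)]
with_linger = simulate_batches(records, batch_size=1_048_576, linger_ms=100)
no_linger = simulate_batches(records, batch_size=1_048_576, linger_ms=0)
print(with_linger)
print(no_linger)
```

In this model no batch ever reaches exactly 1 MB: batches go out either because the next record would overflow the limit or because the wait time ran out. With linger_ms=0 every record is sent on its own, so getting close to "send only at 1 MB" would need a large wait time in addition to batch.size.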
# TiCDC Configuration File Example
# TiCDC Service Listening Address
server:
  # Listening Address
  host: 0.0.0.0
  # Listening Port
  port: 8300
# CDC Configuration
cdc:
  # Changefeed Configuration
  changefeed:
    # CDC Configuration Name
    id: "example-changefeed"
    # Target Storage Type, here is Kafka
    sink-uri: "kafka://kafka-broker:9092/example-topic"
    # Other CDC related configurations
    ...
# Kafka Producer Configuration File Path
kafka-producer.config: "/path/to/kafka/producer.properties"