How to configure Kafka's batch.size in CDC

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: cdc怎么配置kafka的batch.size

| username: TiDBer_n74n5Bz6

How do I configure Kafka's batch.size in CDC?

| username: Fly-bird | Original post link

tidb-cdc: The Kafka topic name
kafka-version: Downstream Kafka version number (optional, default value is 2.4.0; the minimum supported version is currently 0.11.0.2)
kafka-client-id: Kafka client ID for the synchronization task (optional, default value is TiCDC_sarama_producer_ followed by the ID of the synchronization task)
partition-num: Number of downstream Kafka partitions (optional, cannot be greater than the actual number of partitions. If not specified, the partition count is obtained automatically)
protocol: Message protocol used for output to Kafka; valid values are default, canal, avro, maxwell, canal-json (default value is default)
max-message-bytes: Maximum amount of data sent to the Kafka broker in each request (optional, default value is 64 MB)
replication-factor: Number of replicas for Kafka message storage (optional, default value is 1)
ca: Path to the CA certificate file required to connect to the downstream Kafka instance (optional)
cert: Path to the certificate file required to connect to the downstream Kafka instance (optional)
key: Path to the certificate key file required to connect to the downstream Kafka instance (optional)

Which parameters do you need?
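
For reference, the parameters listed above are passed as query parameters on the Kafka sink URI. The following is a minimal Python sketch of how such a URI could be assembled. The helper name `build_kafka_sink_uri`, the broker address `127.0.0.1:9092`, and the partition count of 6 are assumptions chosen for illustration, not values from this thread.

```python
from urllib.parse import urlencode


def build_kafka_sink_uri(broker, topic, **params):
    """Compose a kafka:// sink URI from a broker address, a topic, and query parameters."""
    # The URI parameters use hyphens, so map Python-style underscores back to hyphens.
    query = urlencode({name.replace("_", "-"): value for name, value in params.items()})
    return f"kafka://{broker}/{topic}" + (f"?{query}" if query else "")


# Hypothetical values: adjust broker, topic, and parameters to your environment.
sink_uri = build_kafka_sink_uri(
    "127.0.0.1:9092",
    "tidb-cdc",
    kafka_version="2.4.0",
    partition_num=6,
    max_message_bytes=64 * 1024 * 1024,  # 64 MB expressed in bytes
    replication_factor=1,
)
print(sink_uri)
```

The resulting string is what would be supplied as the sink URI when creating the changefeed. Note that none of the listed parameters is named batch.size.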

| username: 连连看db | Original post link

You can refer to the documentation: TiCDC Changefeed Command-Line Parameters and Configuration Parameters | PingCAP Docs

| username: TiDBer_n74n5Bz6 | Original post link

With CDC configured, I want messages to be sent to Kafka only once a batch reaches 1 MB.

| username: WalterWj | Original post link

It looks like what can be controlled is the number of rows. :thinking:

| username: WalterWj | Original post link

How about trying this parameter for the URI?


But that doesn't seem to be what you want. It looks like you want to accumulate 1 MB before sending?
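
To make the requested behavior concrete, here is a small Python sketch of size-based batching: messages are held back until the accumulated payload reaches a byte threshold. This is only an illustration of the idea; it is not TiCDC or Kafka code, and the class name `SizeBatcher` and its `send` callback are hypothetical. For comparison, in Kafka's own producer batch.size is an upper bound on a batch, and a batch may be sent before it fills, so the strict "only when full" behavior below is stronger than what that setting provides.

```python
class SizeBatcher:
    """Collect messages and hand them to `send` once the batch reaches `batch_bytes`."""

    def __init__(self, batch_bytes, send):
        self.batch_bytes = batch_bytes
        self.send = send
        self._pending = []
        self._pending_bytes = 0

    def add(self, message: bytes):
        # Buffer the message and flush as soon as the size threshold is reached.
        self._pending.append(message)
        self._pending_bytes += len(message)
        if self._pending_bytes >= self.batch_bytes:
            self.flush()

    def flush(self):
        # Send whatever is buffered, even if it is smaller than the threshold.
        if self._pending:
            self.send(self._pending)
            self._pending = []
            self._pending_bytes = 0


sent_batches = []
batcher = SizeBatcher(batch_bytes=1024 * 1024, send=sent_batches.append)
for _ in range(5):
    batcher.add(b"x" * (300 * 1024))  # five 300 KiB messages
batcher.flush()  # push out the final partial batch
print([sum(len(m) for m in batch) for batch in sent_batches])  # [1228800, 307200]
```

The first batch goes out when the fourth message pushes the total past 1 MiB; the fifth message is only sent by the explicit final `flush()`, which is why real producers usually pair a size threshold with a time limit.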

| username: ljluestc | Original post link

# Kafka Producer Configuration Example

# Kafka Producer Configuration
bootstrap.servers=kafka-broker1:9092,kafka-broker2:9092
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer

# Set batch.size parameter
batch.size=1048576

# TiCDC Configuration File Example

# TiCDC Service Listening Address
server:
  # Listening Address
  host: 0.0.0.0
  # Listening Port
  port: 8300

# CDC Configuration
cdc:
  # Changefeed Configuration
  changefeed:
    # CDC Configuration Name
    id: "example-changefeed"
    # Target Storage Type, here is Kafka
    sink-uri: "kafka://kafka-broker:9092/example-topic"
    # Other CDC related configurations
    ...
    # Kafka Producer Configuration File Path
    kafka-producer.config: "/path/to/kafka/producer.properties"

| username: redgame | Original post link

Where is this document? I couldn’t find it…