This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TiDB 5.4.X: CDC task error: TiCDC cannot deliver messages when the replication-factor is less than min.insync.replicas
【TiDB Usage Environment】Testing
【TiDB Version】5.4.0 - 5.4.3
【Encountered Issue】TiCDC cannot deliver messages when the replication-factor is less than min.insync.replicas
【Issue Phenomenon and Impact】
- Issue Phenomenon
After upgrading from 4.0.15 to 5.4.3, a CDC task that had been running normally started reporting an error.
CDC task error:
"message": "[CDC:ErrKafkaNewSaramaProducer]new sarama producer: [CDC:ErrKafkaInvalidConfig]because TiCDC Kafka producer's request.required.acks defaults to -1, TiCDC cannot deliver messages when the replication-factor is less than min.insync.replicas: replication-factor cannot be smaller than the min.insync.replicas of topic"
CDC log error:
[ERROR] [changefeed.go:119] ["an error occurred in Owner"] [changefeed=testcdc0-testcdc-t5] [error="[CDC:ErrKafkaNewSaramaProducer]new sarama producer: [CDC:ErrKafkaInvalidConfig]because TiCDC Kafka producer's request.required.acks defaults to -1, TiCDC cannot deliver messages when the replication-factor is less than min.insync.replicas: replication-factor cannot be smaller than the min.insync.replicas of topic"]
- Detailed Testing
According to the error message, when request.required.acks is -1, Kafka's replication-factor cannot be less than min.insync.replicas. However, actual testing showed the following:
When min.insync.replicas is 1, setting replication-factor to 1, 2, or 3 results in normal synchronization.
When min.insync.replicas is 2, setting replication-factor to 1, 2, or 3 results in errors.
When min.insync.replicas is 3, setting replication-factor to 1, 2, or 3 results in errors.
The problem is that when min.insync.replicas is 2 and replication-factor is 3, both new and old CDC tasks report errors. Production Kafka clusters are often configured this way, so the error can significantly impact CDC tasks.
It is also unclear why CDC tasks report errors when min.insync.replicas is 2 and replication-factor is 2, yet run normally when min.insync.replicas is 1 and replication-factor is 1.
- Involved Versions
Testing shows that this CDC error occurs whether the cluster is upgraded to 5.4.X or installed directly as 5.4.X. The issue does not appear in versions 4.0.X or 5.0.X through 5.3.X.
In CDC 4.x, the default value of the CDC producer's request.required.acks parameter is 1.
In CDC 5.4.x, the default is changed to -1, which means a message is considered successfully sent only after all in-sync replicas have acknowledged it.
Additionally, the min.insync.replicas parameter only takes effect when request.required.acks is set to -1.
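The interaction between request.required.acks and min.insync.replicas described above can be sketched as follows. This is a simplified model of broker behavior, not Kafka or TiCDC code; the function name and its arguments are hypothetical and exist only for illustration.

```python
def broker_accepts_write(acks: int, isr_size: int, min_insync_replicas: int) -> bool:
    """Simplified model: does the partition leader accept a produce request?

    acks = 0 or 1: min.insync.replicas is not enforced.
    acks = -1 (all): the write is rejected when the in-sync replica set
    is smaller than min.insync.replicas.
    """
    if acks == -1:
        return isr_size >= min_insync_replicas
    return True


# Three in-sync replicas and min.insync.replicas=2: the write is accepted.
print(broker_accepts_write(acks=-1, isr_size=3, min_insync_replicas=2))
# A single replica can never satisfy min.insync.replicas=2 under acks=-1.
print(broker_accepts_write(acks=-1, isr_size=1, min_insync_replicas=2))
# With acks=1 the same topic still accepts the write.
print(broker_accepts_write(acks=1, isr_size=1, min_insync_replicas=2))
```

Under this model, a topic whose replication-factor is below min.insync.replicas can never accept a write with acks=-1, which is presumably why TiCDC rejects that configuration up front.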
This doesn't seem to match the expected behavior. What is the default replication-factor of your cluster? What does the sink-uri look like when you create the changefeed? TiCDC only reports an error when replication-factor < min.insync.replicas. It does not report an error when replication-factor is equal to or greater than min.insync.replicas. Is your topic new or old? If it's old, what was the replication-factor when it was created?
The topic was manually created in advance: --create --zookeeper XXX --replication-factor 3 --partitions 3 --topic test1
The sink-uri is like this:
/usr/bin/cdc cli changefeed create --pd=XXX --start-ts=XXX --sink-uri="kafka://XXX/test1?message.max.bytes=2147483648?partition-num=3"
Try adding replication-factor=3 in the sink-uri.
Changed the sink-uri to this, but still getting an error:
/usr/bin/cdc cli changefeed create --pd=XXX --start-ts=XXX --sink-uri="kafka://XXX/test1?message.max.bytes=2147483648?partition-num=3?replication-factor=3"
This is very strange. Can you post the complete creation process and the error logs after the change? Also, please check the parameters of your topic.
Create Kafka topic:
/data/kafka/kafka_2.12-2.4.1/bin/ --create --zookeeper XXX:2181 --replication-factor 3 --partitions 3 --config min.insync.replicas=2 --topic test3
View topic properties:
/data/kafka/kafka_2.12-2.4.1/bin/ --describe --bootstrap-server XXX:9092 --topic test3
Topic: test3 PartitionCount: 3 ReplicationFactor: 3 Configs: min.insync.replicas=2,segment.bytes=1073741824
Topic: test3 Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
Topic: test3 Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Topic: test3 Partition: 2 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2
Create CDC task:
/usr/bin/cdc cli changefeed create --pd=XXX --start-ts=436935319064936449 --sink-uri="kafka://XXX:9092/test3?message.max.bytes=2147483648?partition-num=3?replication-factor=3" --changefeed-id="testcdc0" --config=/home/tidb/testcdc_yaml/testcdc0_testcdc_t0.yaml
Configuration file:
case-sensitive = true
enable-old-value = true
[filter]
rules = ["testcdc0.testcdc_t0"]
[mounter]
worker-num = 8
[sink]
dispatchers = [
    {matcher = ["testcdc0.testcdc_t0"], dispatcher = "rowid"},
]
protocol = "canal-json"
[cyclic-replication]
enable = false
replica-id = 1
Task creation return:
Create changefeed successfully!
ID: testcdc0
Info: {"sink-uri":"kafka://XXX:9092/test3?message.max.bytes=2147483648?partition-num=3?replication-factor=3","opts":{},"create-time":"2022-10-26T17:19:45.759358278+08:00","start-ts":436935319064936449,"target-ts":0,"admin-job-type":0,"sort-engine":"unified","sort-dir":"","config":{"case-sensitive":true,"enable-old-value":true,"force-replicate":false,"check-gc-safe-point":true,"filter":{"rules":["testcdc0.testcdc_t0"],"ignore-txn-start-ts":null},"mounter":{"worker-num":8},"sink":{"dispatchers":[{"matcher":["testcdc0.testcdc_t0"],"dispatcher":"rowid"}],"protocol":"canal-json"},"cyclic-replication":{"enable":false,"replica-id":1,"filter-replica-ids":null,"id-buckets":0,"sync-ddl":false},"scheduler":{"type":"table-number","polling-time":-1}},"state":"normal","history":null,"error":null,"sync-point-enabled":false,"sync-point-interval":600000000000,"creator-version":"v4.0.16"}
Task error message:
cdc cli changefeed query -s --pd=http://XXX:2379 --changefeed-id=testcdc0
{
  "state": "error",
  "tso": 436935319064936449,
  "checkpoint": "2022-10-26 17:19:26.892",
  "error": {
    "addr": "",
    "code": "CDC:ErrKafkaNewSaramaProducer",
    "message": "[CDC:ErrKafkaNewSaramaProducer]new sarama producer: [CDC:ErrKafkaInvalidConfig]because TiCDC Kafka producer's request.required.acks defaults to -1, TiCDC cannot deliver messages when the replication-factor is less than min.insync.replicas: replication-factor cannot be smaller than the min.insync.replicas of topic"
  }
}
Log error:
[2022/10/26 17:20:16.193 +08:00] [ERROR] [kafka.go:571] ["replication-factor cannot be smaller than the min.insync.replicas of topic"] [replicationFactor=1] [minInsyncReplicas=2]
[2022/10/26 17:20:16.581 +08:00] [ERROR] [changefeed.go:119] ["an error occurred in Owner"] [changefeed=testcdc0] [error="[CDC:ErrKafkaNewSaramaProducer]new sarama producer: [CDC:ErrKafkaInvalidConfig]because TiCDC Kafka producer's request.required.acks defaults to -1, TiCDC cannot deliver messages when the replication-factor is less than min.insync.replicas: replication-factor cannot be smaller than the min.insync.replicas of topic"]
I found this in the error log: [replicationFactor=1] [minInsyncReplicas=2], even though the topic's replication factor is clearly set to 3.
This looks like the source of the error. I don't know why the topic was created with a replication factor of 3 but CDC thinks replicationFactor=1.
It also explains why the CDC task reports an error when min.insync.replicas is 2 and replication-factor is 2, but works fine when min.insync.replicas is 1 and replication-factor is 1.
It is probably because CDC always treats the replicationFactor value as 1.
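The hypothesis above can be checked against the earlier test results with a small sketch. It assumes the check is simply replication-factor < min.insync.replicas, as the error message states, and that the value CDC actually uses is always 1. The function and variable names are hypothetical and are not taken from TiCDC's code.

```python
def changefeed_errors(effective_replication_factor: int, min_insync_replicas: int) -> bool:
    """Return True when the check quoted in the error message would fail."""
    return effective_replication_factor < min_insync_replicas


# Assumption: whatever the topic was created with, CDC sees a value of 1.
EFFECTIVE_RF = 1

for min_isr in (1, 2, 3):
    for topic_rf in (1, 2, 3):
        result = "error" if changefeed_errors(EFFECTIVE_RF, min_isr) else "ok"
        print(f"min.insync.replicas={min_isr} topic replication-factor={topic_rf}: {result}")
```

Under these assumptions every combination with min.insync.replicas=1 passes and every combination with min.insync.replicas of 2 or 3 fails, regardless of the topic's real replication factor, which is the same pattern seen in testing.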
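Your sink-uri is incorrect: the parameters are joined with ? instead of &, so replication-factor=3 is never read as a separate parameter, which is consistent with replicationFactor=1 in the log. The format for passing parameters should be xxx?a1=1&a2=2&a3=3.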
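To see why the original sink-uri fails, the two forms can be compared with a standard URL parser. This sketch uses Python's urllib.parse as a stand-in. It is an assumption that TiCDC's own parsing treats the query string the same way, although that would be consistent with replicationFactor=1 in the log. The host broker:9092 is a placeholder.

```python
from urllib.parse import parse_qs, urlsplit

# Parameters joined with "?" (as in the failing changefeed) and with "&".
bad = "kafka://broker:9092/test3?message.max.bytes=2147483648?partition-num=3?replication-factor=3"
good = "kafka://broker:9092/test3?message.max.bytes=2147483648&partition-num=3&replication-factor=3"


def query_params(uri: str) -> dict:
    """Split the URI and parse everything after the first '?' as a query string."""
    return parse_qs(urlsplit(uri).query)


# Only the first "?" starts the query; later "?" characters are plain data,
# so the whole tail becomes the value of message.max.bytes.
print(query_params(bad))
# With "&" each parameter is parsed on its own.
print(query_params(good))
```

In the first form the parser sees a single key, message.max.bytes, so partition-num and replication-factor are never set and any defaults would apply.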