Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: ticdc传输数据到kafka中一条record信息中包含了两张不同表的信息 (When TiCDC sends data to Kafka, a single record contains information from two different tables)
【TiDB Environment】Test environment
【TiDB Version】5.3.1
【Encountered Problem】
When TiCDC transfers data to Kafka, a single record contains information from two different tables. Is this normal, and is there a setting that avoids it?
【Reproduction Path】Create a TiCDC task that replicates from TiDB to Kafka
【Problem Phenomenon and Impact】
The destination is Kafka. In our local testing, the key and value of a record usually contain information from a single table, and we parse the key of each Kafka record to determine the table name. Now, however, some record keys contain the names of two tables, so data from different tables is processed as if it belonged to one table. There is also a second issue, concerning the value:
Some values start with D{{"u":{"AccountAttr":{ and others with DY{"u":{"AccountAttr":{. Under what circumstances is the prefix D{{, and under what circumstances is it DY{?
We hope that the key and value of a record in the Kafka topic only contain information from the same table.
For this, you only need to configure the tables you require.
Hello, I have configured it this way, but the key of a single record still contains information from both tables a_a_n613b and a_a_n_613a.
Could you also explain the record value data?
Some values start with D{{"u":{"AccountAttr":{ and others with DY{"u":{"AccountAttr":{. I would like to know under what circumstances the prefix is D{{ and under what circumstances it is DY{.
Is there any related documentation for this?
First question: if you want only a single table's data to appear in the topic, configure the filter to match only that table.
Second question: please explain in more detail.
Regarding the first question: should we start a separate TiCDC task for each table, or configure it in the configuration file like this:
[filter]
rules = ['dp_test.*a_tt', 'dp_test.a_a_n613b', 'dp_test.a_a_n_613a']
Is this configuration sufficient?
The second question is about consuming data from the topic. The value retrieved is shown in the image. I would like to ask under what circumstances it starts with "D{{" and under what circumstances it starts with "DY{".
I am not sure about this either. It looks like all the data is being inserted.
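"DY" is the length of the value, displayed as ASCII characters. In the Open Protocol, each event value is preceded by an 8-byte length field, so when a record is printed as text, the low-order bytes of that field appear as characters such as "DY" or "D{" in front of the JSON. Which characters you see depends only on how long the value is.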
For more details, see: TiCDC Open Protocol | PingCAP Docs
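To make the length-prefix explanation concrete, here is a minimal Python sketch. It assumes the 8-byte big-endian length field described in the Open Protocol documentation; the two lengths used (0x4459 and 0x447B) are hypothetical values picked only because their low-order bytes reproduce the characters seen in this thread.

```python
import struct


def render_length_prefix(length: int) -> str:
    """Pack a length as an 8-byte big-endian integer and return its printable bytes.

    A text viewer typically hides the leading zero bytes, so only the
    printable low-order bytes remain visible in front of the JSON.
    """
    raw = struct.pack(">Q", length)
    return "".join(chr(b) for b in raw if 32 <= b < 127)


# Hypothetical value lengths, chosen to reproduce the observed prefixes.
json_start = '{"u":{"AccountAttr":{'
print(render_length_prefix(0x4459) + json_start)  # DY{"u":{"AccountAttr":{
print(render_length_prefix(0x447B) + json_start)  # D{{"u":{"AccountAttr":{
```

In other words, the extra "{" in "D{{" would simply be the last byte of the length field (0x7B), followed by the opening brace of the JSON itself.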
Hello, for example, we have configured a TiCDC task to synchronize data from three tables: A, B, and C.
[filter]
rules = ['dp_test.A', 'dp_test.B', 'dp_test.C']
The problem is that, for some records in the topic, the key contains information from both tables A and B, and the value also contains data from both tables A and B, all in the same record.
We believe that the key and value of a record should contain changes from only one table, not two.
Could you please advise on how to avoid this situation? Is there something wrong with our settings? Thank you.
With this configuration, that behavior is normal.
Excuse me, what configuration ensures that the key and value of each record in the topic contain information from only one table?
If that cannot be guaranteed, then when parsing a record whose key contains information from two tables, how can the keys and values of the two tables be matched one by one? Seeking advice. Thank you.
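One way to handle a record that carries several tables is to decode the batch and pair the Nth key with the Nth value. The Python sketch below assumes the layout described in the TiCDC Open Protocol documentation: the key starts with an 8-byte protocol version, and both key and value are sequences of 8-byte big-endian lengths followed by JSON, with "scm", "tbl", and "t" fields in each event key. The function names and the sample events are hypothetical and serve only as illustration.

```python
import json
import struct
from collections import defaultdict


def read_length_prefixed(buf: bytes, offset: int = 0) -> list:
    """Split a buffer into chunks, each preceded by an 8-byte big-endian length."""
    chunks = []
    while offset < len(buf):
        (length,) = struct.unpack_from(">Q", buf, offset)
        offset += 8
        if offset + length > len(buf):
            raise ValueError("truncated message")
        chunks.append(buf[offset:offset + length])
        offset += length
    return chunks


def decode_record(key: bytes, value: bytes) -> list:
    """Return (event_key, event_value) pairs from one Kafka record."""
    # Skip the 8-byte protocol version at the start of the key.
    keys = [json.loads(k) for k in read_length_prefixed(key, 8)]
    # Events without a payload are assumed to have an empty value.
    values = [json.loads(v) if v else None for v in read_length_prefixed(value)]
    if len(keys) != len(values):
        raise ValueError("key and value carry a different number of events")
    return list(zip(keys, values))


def group_by_table(key: bytes, value: bytes) -> dict:
    """Group row-change events (t == 1) by (schema, table)."""
    grouped = defaultdict(list)
    for event_key, event_value in decode_record(key, value):
        if event_key.get("t") != 1:
            continue  # skip DDL and resolved events
        grouped[(event_key["scm"], event_key["tbl"])].append(event_value)
    return dict(grouped)


def pack_events(events: list) -> bytes:
    """Demo-only helper: encode events the way decode_record expects them."""
    bodies = [json.dumps(e).encode("utf-8") for e in events]
    return b"".join(struct.pack(">Q", len(b)) + b for b in bodies)


# Hypothetical record containing one update for table A and one for table B.
demo_keys = [
    {"ts": 1, "scm": "dp_test", "tbl": "A", "t": 1},
    {"ts": 2, "scm": "dp_test", "tbl": "B", "t": 1},
]
demo_values = [{"u": {"id": {"t": 3, "v": 1}}}, {"u": {"id": {"t": 3, "v": 2}}}]
demo_key_bytes = struct.pack(">Q", 1) + pack_events(demo_keys)
demo_value_bytes = pack_events(demo_values)

for table, rows in group_by_table(demo_key_bytes, demo_value_bytes).items():
    print(table, rows)
```

With this approach the consumer no longer needs each record to contain a single table, because every event is routed by its own key rather than by the record as a whole.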
[filter]
rules = ['dp_test.A']
Synchronize only one table per task, that is, split the three tables into separate tasks. Alternatively, use version 6.1, which supports topic dispatchers: https://docs.pingcap.com/zh/tidb/stable/manage-ticdc#topic-分发器
I want to confirm whether TiCDC version 5.3.0 supports the Canal-JSON protocol.
Is there any documentation on Canal for version 5.3.0?
It is not supported in version 5.3.