TiCDC Unable to Synchronize Data to Downstream Database

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: ticdc 无法同步数据到下游数据库

| username: vesa

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] v6.1.0
[Reproduction Path]

  1. Use ticdc to synchronize data to the downstream TiDB;
  2. Status of the source database (the screenshot is not preserved; its red box marked the cdc node in use).
  3. Table structure for synchronization:
CREATE TABLE `ads_test_pushuo_file` (
  `crticle_content_id` bigint(20) NOT NULL,
  `filename` varchar(100) NOT NULL,
  `content` text DEFAULT NULL,
  PRIMARY KEY (`crticle_content_id`) /*T![clustered_index] NONCLUSTERED */,
  UNIQUE KEY `primary_index` (`crticle_content_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
  4. Synchronization command:
    Log in to the server hosting the node 172.16.2.3:8300 and run:
/data-tidb/tidb-deploy/ticdc-8300/bin/cdc cli changefeed create --pd=http://172.16.2.13:2379 --sink-uri="tidb://root:playlist6@172.1.1.12:4000/" --changefeed-id="ads-test-file"  --config kmxxg_content_security.toml
  5. Configuration file:
case-sensitive = true

enable-old-value = true

[filter]

rules = ['kmxxg_content_security.ads_test_pushuo_file']

[mounter]
worker-num = 16

[sink]
protocol = "default"

[cyclic-replication]
enable = false
replica-id = 1
filter-replica-ids = [2,3]
sync-ddl = true
  6. Check the cdc task status (screenshot not preserved).
  7. Check the detailed cdc task:
{
  "info": {
    "upstream-id": 0,
    "sink-uri": "tidb://root:playlist6@172.1.1.12:4000/",
    "opts": {},
    "create-time": "2022-12-13T14:45:10.378326772+08:00",
    "start-ts": 438020056014913540,
    "target-ts": 0,
    "admin-job-type": 0,
    "sort-engine": "unified",
    "sort-dir": "",
    "config": {
      "case-sensitive": true,
      "enable-old-value": true,
      "force-replicate": false,
      "check-gc-safe-point": true,
      "filter": {
        "rules": [
          "kmxxg_content_security.ads_test_pushuo_file"
        ],
        "ignore-txn-start-ts": null
      },
      "mounter": {
        "worker-num": 16
      },
      "sink": {
        "dispatchers": null,
        "protocol": "default",
        "column-selectors": null,
        "schema-registry": ""
      },
      "cyclic-replication": {
        "enable": false,
        "replica-id": 1,
        "filter-replica-ids": [
          2,
          3
        ],
        "id-buckets": 0,
        "sync-ddl": true
      },
      "consistent": {
        "level": "none",
        "max-log-size": 64,
        "flush-interval": 1000,
        "storage": ""
      }
    },
    "state": "normal",
    "error": null,
    "sync-point-enabled": false,
    "sync-point-interval": 600000000000,
    "creator-version": "v6.1.0"
  },
  "status": {
    "resolved-ts": 438062843066580994,
    "checkpoint-ts": 438062843066580994,
    "admin-job-type": 0
  },
  "count": 0,
  "task-status": []
}
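
One way to read the output above is to decode the TSO timestamps and check whether task-status is populated. The following is a minimal sketch, not part of the original post: the function names are illustrative, and it assumes the standard TSO layout (physical milliseconds in the high bits, an 18-bit logical counter in the low bits).

```python
import json
from datetime import datetime, timedelta, timezone

TSO_LOGICAL_BITS = 18  # the low 18 bits of a TSO hold the logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_tso(ts: int):
    """Split a TSO into (physical time in UTC, logical counter)."""
    physical_ms = ts >> TSO_LOGICAL_BITS
    logical = ts & ((1 << TSO_LOGICAL_BITS) - 1)
    return EPOCH + timedelta(milliseconds=physical_ms), logical


def summarize_changefeed(raw: str) -> dict:
    """Pull the fields discussed in this thread out of the query output."""
    doc = json.loads(raw)
    checkpoint_time, _ = decode_tso(doc["status"]["checkpoint-ts"])
    return {
        "state": doc["info"]["state"],
        "checkpoint_utc": checkpoint_time.isoformat(),
        # An empty list means no capture reports handling the task.
        "has_task_status": bool(doc.get("task-status")),
    }


# Trimmed copy of the output above, keeping only the fields used here.
sample = """
{
  "info": {"state": "normal"},
  "status": {"checkpoint-ts": 438062843066580994},
  "task-status": []
}
"""
print(summarize_changefeed(sample))
```

For the checkpoint-ts above this decodes to 2022-12-15 04:05:29.618 UTC, roughly two days after the create-time, which would suggest the checkpoint is advancing even though task-status is empty.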

[Encountered Problem: Problem Phenomenon and Impact]

  1. Insert data into ads_test_pushuo_file in the source database;
  2. Check ads_test_pushuo_file in the target database: no data has been added.

Based on the configuration above, could you help identify where the synchronization is failing?

[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]

| username: Meditator | Original post link

  1. The query output is odd: the task-status array is empty. Normally it lists which cdc-server (capture) is handling the task.
  2. Check whether the cdc-server cluster is functioning properly by reviewing the logs on each node.

| username: vesa | Original post link

There are no obvious synchronization records in the CDC logs, but there are records of DDL statements from the source database. Does --pd need to point to the PD leader?

| username: weixiaobing | Original post link

The sink dispatchers are not configured; you can add them:
https://docs.pingcap.com/zh/tidb/v5.4/manage-ticdc
[sink]
dispatchers = [
{matcher = ['wxb.t'], dispatcher = "rowid"},
]

| username: Meditator | Original post link

It seems that the cdc-server cluster has issues accessing the pd cluster. Check if there are any network-related problems.

| username: vesa | Original post link

There is no communication problem: replicating to Kafka with this same CDC cluster works normally, and in that case the task-status object is populated. Is there any issue with the TOML configuration file?

| username: yilong | Original post link

The command specifies --pd=http://172.16.2.13:2379, but no PD node seems to have that address. The only PD nodes are 2.1, 2.2, and 2.3, right?
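
The check suggested here can be sketched as a small script that extracts the --pd host from the command line and compares it with the known PD hosts. This is only an illustration: the function names are made up, only the `--pd=value` form is handled, and the PD host list is an assumption that expands "2.1, 2.2, and 2.3" to 172.16.2.x addresses, which the thread does not confirm.

```python
import shlex
from urllib.parse import urlsplit


def pd_host_from_command(command: str) -> str:
    """Return the host part of the --pd=... flag in a cdc cli command line."""
    for token in shlex.split(command):
        if token.startswith("--pd="):
            return urlsplit(token[len("--pd="):]).hostname
    raise ValueError("no --pd=... flag found")


def pd_flag_matches(command: str, known_pd_hosts) -> bool:
    """True if the --pd host is one of the PD hosts you expect."""
    return pd_host_from_command(command) in set(known_pd_hosts)


# Command copied from the original post.
command = (
    '/data-tidb/tidb-deploy/ticdc-8300/bin/cdc cli changefeed create '
    '--pd=http://172.16.2.13:2379 '
    '--sink-uri="tidb://root:playlist6@172.1.1.12:4000/" '
    '--changefeed-id="ads-test-file" --config kmxxg_content_security.toml'
)

# ASSUMPTION: hypothetical expansion of "2.1, 2.2, and 2.3"; replace with the
# PD addresses your cluster actually reports.
assumed_pd_hosts = ["172.16.2.1", "172.16.2.2", "172.16.2.3"]

print(pd_host_from_command(command), pd_flag_matches(command, assumed_pd_hosts))
```

Under that assumption the script reports a mismatch for 172.16.2.13, which is the discrepancy raised in this reply.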