How to Use Golang to Retrieve Data Change Information from TiCDC

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 怎么使用golang来获取ticdc的数据变化信息

| username: TiDBer_5VobY5Th

How can I write Go code that receives TiCDC change data? The documentation currently only describes managing TiCDC through the cdc cli command line.

| username: zhh_912 | Original post link

In Go, you can obtain information from TiCDC through its OpenAPI by sending HTTP requests to the TiCDC server.

First, make sure the TiCDC service is running and that you know its API endpoint.
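The HTTP call described above could look roughly like the following. This is a minimal sketch, not an official client: the address `http://127.0.0.1:8300`, the `/api/v2/status` path, and the three decoded field names are assumptions taken from the OpenAPI v2 documentation and should be checked against your TiCDC version. Note that this returns server status only, not row changes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// ServerStatus holds a few fields of the status response.
// The field names are assumptions; verify them for your TiCDC version.
type ServerStatus struct {
	Version string `json:"version"`
	ID      string `json:"id"`
	IsOwner bool   `json:"is_owner"`
}

// statusURL builds the status endpoint URL from a base address.
func statusURL(base string) string {
	return strings.TrimRight(base, "/") + "/api/v2/status"
}

// parseStatus decodes the JSON body of a status response.
func parseStatus(body []byte) (ServerStatus, error) {
	var s ServerStatus
	err := json.Unmarshal(body, &s)
	return s, err
}

// fetchStatus sends the GET request and decodes the reply.
func fetchStatus(client *http.Client, base string) (ServerStatus, error) {
	resp, err := client.Get(statusURL(base))
	if err != nil {
		return ServerStatus{}, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return ServerStatus{}, err
	}
	if resp.StatusCode != http.StatusOK {
		return ServerStatus{}, fmt.Errorf("unexpected status %d: %s", resp.StatusCode, body)
	}
	return parseStatus(body)
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	// Assumed address: adjust host and port for your deployment.
	st, err := fetchStatus(client, "http://127.0.0.1:8300")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Printf("version=%s id=%s owner=%v\n", st.Version, st.ID, st.IsOwner)
}
```

The same pattern should apply to the other OpenAPI endpoints, such as listing changefeeds, by changing the path and the decoded struct.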

| username: 这里介绍不了我 | Original post link

Refer to the OpenAPI documentation: TiCDC OpenAPI v2 | PingCAP Documentation Center

| username: TiDBer_QYr0vohO | Original post link

Call TiCDC's OpenAPI.

| username: zhaokede | Original post link

I'm not familiar with Go, but the approach is the same in any language: you call TiCDC's external API to get the information.

| username: 不想干活 | Original post link

TiCDC provides OpenAPI functionality, allowing you to perform query and maintenance operations on the TiCDC cluster through OpenAPI v2. The functionality of OpenAPI is a subset of the cdc cli tool.

| username: TiDBer_5VobY5Th | Original post link

It seems that the OpenAPI only offers control functions. It cannot deliver the change data itself for me to process.

| username: TiDBer_5VobY5Th | Original post link

I want to receive the change data myself and decide how to apply it. That does not seem to be available through the OpenAPI.

| username: 这里介绍不了我 | Original post link

Then you can have TiCDC write the changes to Kafka and consume them from there.

| username: TiDBer_H5NdJb5Q | Original post link

Do you really need to write your own sink? You don't. It is more reliable to sink to Kafka first and then read from Kafka.
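Once the changes are in Kafka, the consumer side is mostly message decoding. The sketch below assumes the changefeed uses the Canal-JSON protocol and shows only the decoding step with the Go standard library. Reading from Kafka itself needs a third-party client and is left out. The `ChangeEvent` struct, the `describe` helper, and the sample message are illustrative, and the JSON field names are assumptions based on the Canal-JSON format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChangeEvent maps a subset of a Canal-JSON message (assumed field names).
type ChangeEvent struct {
	Database string           `json:"database"`
	Table    string           `json:"table"`
	Type     string           `json:"type"` // e.g. INSERT, UPDATE, DELETE
	IsDDL    bool             `json:"isDdl"`
	PKNames  []string         `json:"pkNames"`
	Data     []map[string]any `json:"data"` // row values after the change
	Old      []map[string]any `json:"old"`  // previous values for updates
}

// decodeEvent parses one message value as read from the Kafka topic.
func decodeEvent(msg []byte) (ChangeEvent, error) {
	var ev ChangeEvent
	err := json.Unmarshal(msg, &ev)
	return ev, err
}

// describe summarizes an event; a real consumer would apply it here
// according to its own rules instead of formatting a string.
func describe(ev ChangeEvent) string {
	if ev.IsDDL {
		return fmt.Sprintf("DDL on %s.%s", ev.Database, ev.Table)
	}
	return fmt.Sprintf("%s %s.%s rows=%d", ev.Type, ev.Database, ev.Table, len(ev.Data))
}

func main() {
	// Hypothetical sample message, not captured from a real cluster.
	sample := []byte(`{
		"database": "test", "table": "users", "pkNames": ["id"],
		"isDdl": false, "type": "INSERT",
		"data": [{"id": "5", "name": "alice"}], "old": null
	}`)
	ev, err := decodeEvent(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(describe(ev))
}
```

Because the application decides what to do with each decoded event, this is also the place to implement custom conflict handling for rows that already exist downstream.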

| username: TiDBer_5VobY5Th | Original post link

Writing to Kafka requires setting up Kafka, and there is no Kafka in our environment. That is why I want to receive the change data myself.

| username: TiDBer_5VobY5Th | Original post link

The main issue is that TiCDC's own mechanism for applying changes to the downstream database does not seem reliable and is prone to errors. For example, when it inserts a row and the downstream database already has a record with the same ID, the insert fails and replication stops.

| username: Jellybean | Original post link

TiCDC supports reentrant replication. What specific error are you seeing?

| username: TiDBer_5VobY5Th | Original post link

For example, during replication, if a record with ID 5 is created in the upstream TiDB database but a record with ID 5 already exists in the downstream TiDB database, TiCDC fails to apply it and the entire task fails.

| username: Jack-li | Original post link

You can use Go to interact with TiDB’s API.

| username: Jellybean | Original post link

This should generally not happen in that scenario. What is safe_mode set to for your cluster's TiCDC? When it is enabled, TiCDC writes to the downstream idempotently by converting inserts into REPLACE statements.
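As a rough illustration of where such a setting could go, the sketch below builds a MySQL/TiDB sink URI with safe mode switched on. It assumes the option is exposed as the sink URI query parameter `safe-mode`; the user, password, and address are placeholders, and `buildSinkURI` is a hypothetical helper. Check the exact parameter name against the sink documentation for your TiCDC version.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSinkURI returns a MySQL/TiDB sink URI with safe mode enabled.
// The "safe-mode" parameter name is an assumption; verify it for your version.
func buildSinkURI(user, password, hostPort string) string {
	u := url.URL{
		Scheme: "mysql",
		User:   url.UserPassword(user, password),
		Host:   hostPort,
		Path:   "/",
	}
	q := url.Values{}
	q.Set("safe-mode", "true")
	u.RawQuery = q.Encode()
	return u.String()
}

func main() {
	// Placeholder credentials and address for the downstream database.
	fmt.Println(buildSinkURI("root", "secret", "127.0.0.1:3306"))
}
```

The resulting string would then be passed as the sink URI when the changefeed is created, so that a row that already exists downstream is overwritten instead of causing a duplicate-key failure.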