How to Balance TiCDC Changefeed Tasks Across 4 Instances

translator_bot · June 22, 2024, 11:27pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: cdc changefeeds 任务如何均衡分发到4个实例

| username: wluckdog

【TiDB Usage Environment】Production Environment / Testing / PoC
【TiDB Version】v6.1.0
【Reproduction Path】

The original TiCDC instances are xxx.xxx1:8300 and xxx.xxx2:8300, running 22 changefeed tasks.
I added two new cdc instances, xxx.xxx1:8310 and xxx.xxx2:8310, but no changefeed tables were allocated to them. After restarting cdc, the tables were distributed to the two new instances on port 8310.
How can I distribute the changefeed tasks across all 4 cdc instances?

【Resource Configuration】
【Attachments: Screenshots / Logs / Monitoring】


| username: Meditator

Currently, CDC tasks cannot be balanced and scheduled automatically. If you need to balance them, you have to do it manually.
CDC work is not distributed by the number of changefeed tasks. At the underlying level, it is routed table by table.

You can check this out:

github.com

pingcap/tiflow/blob/master/docs/design/2020-03-04-ticdc-design-and-architecture-cn.md

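# System Architecture

TiCDC is an incremental data replication tool for TiDB, implemented by pulling TiKV change logs. It can restore data to a state consistent with any upstream TSO, and it provides an open data protocol so that other systems can subscribe to data changes.

A TiCDC cluster consists of multiple stateless nodes and achieves high availability through the etcd inside PD. The cluster supports creating multiple replication tasks that replicate data to multiple different downstreams. The TiCDC system architecture is shown in the following figure:

<img src="../media/cdc_architecture.svg?sanitize=true" alt="architecture" width="600"/>

## System Components

- TiKV: only outputs kv change logs

  - Assembles kv change logs in its internal logic
  - Provides an interface for outputting kv change logs; the data sent includes real-time change logs and change logs from incremental scans

- Capture: the TiCDC running process. Multiple captures form a TiCDC cluster, which is responsible for replicating kv change logs
  - Each capture is responsible for pulling a portion of the kv change logs
  - Sorts the one or more kv change logs it has pulled
  - Restores transactions to the downstream or outputs them according to the TiCDC open protocol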

This file has been truncated.

| username: wluckdog

So the underlying layer routes by table. Is the assignment fixed once a new task is created?
Why are the CDC instances on port 8300 not assigned any tasks?

| username: Meditator

It is fixed. Only when a capture (cdc-server) goes down are its tables rebalanced to the remaining cdc-servers, table by table.

| username: wluckdog

How does the number of CDC process instances affect extraction? For example, what is the difference between having 2 CDC instances and 8 instances synchronizing CDC tasks?

| username: asddongmen

The tables replicated by a changefeed are allocated to different captures based on the number of tables.
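
To make count-based allocation concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not TiCDC's actual scheduler: the function name `plan_moves` and the capture names are hypothetical, and the logic simply moves tables from the most loaded capture to the least loaded one until the table counts differ by at most one.

```python
from typing import Dict, List, Tuple


def plan_moves(assignment: Dict[str, List[int]]) -> List[Tuple[int, str, str]]:
    """Return (table_id, from_capture, to_capture) moves that even out table counts.

    Illustrative greedy balance by table count; NOT TiCDC's real scheduler.
    """
    # Work on a copy so the caller's mapping is left untouched.
    state = {capture: list(tables) for capture, tables in assignment.items()}
    moves: List[Tuple[int, str, str]] = []
    if not state:
        return moves
    while True:
        busiest = max(state, key=lambda c: len(state[c]))
        idlest = min(state, key=lambda c: len(state[c]))
        # Stop once the spread between captures is at most one table.
        if len(state[busiest]) - len(state[idlest]) <= 1:
            return moves
        table_id = state[busiest].pop()
        state[idlest].append(table_id)
        moves.append((table_id, busiest, idlest))


# Hypothetical layout: two loaded captures and two newly added, empty ones.
example = {
    "cdc-a": [1, 2, 3, 4],
    "cdc-b": [5, 6, 7, 8],
    "cdc-c": [],
    "cdc-d": [],
}
planned = plan_moves(example)
```

With this hypothetical layout, the plan moves tables off the two loaded captures until each of the four holds two tables. On a real cluster, each planned move would have to be carried out through whatever manual scheduling interface your TiCDC version provides.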

To troubleshoot the issue you encountered, please run the ./cdc version command and paste the CDC version information here.
In theory, CDC should load balance the tables automatically. If it does not, try the following in order:

Use the OpenAPI to trigger scheduling manually. Refer to: TiCDC OpenAPI v1 | PingCAP Documentation Center
If that does not work, consider pausing and then resuming the changefeed.
If neither method works, as a last resort restart the CDC owner node to refresh all states.
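
The first two steps can be sketched in Python with only the standard library. This is a hedged sketch, not an official client: the endpoint paths (`tables/rebalance_table`, `pause`, `resume` under `/api/v1/changefeeds/{changefeed_id}`) follow the TiCDC OpenAPI v1 document as I understand it, so please check them against the linked page for your version. The server address and changefeed ID below are placeholders.

```python
import json
import urllib.request
from typing import Optional


def build_request(server: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build a POST request under the OpenAPI v1 prefix of a cdc server."""
    url = f"http://{server}/api/v1/{path}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method="POST")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def rebalance_request(server: str, changefeed_id: str) -> urllib.request.Request:
    # Assumed v1 endpoint that asks the owner to rebalance a changefeed's tables.
    return build_request(server, f"changefeeds/{changefeed_id}/tables/rebalance_table")


def pause_request(server: str, changefeed_id: str) -> urllib.request.Request:
    # Assumed v1 endpoint that pauses a changefeed.
    return build_request(server, f"changefeeds/{changefeed_id}/pause")


def resume_request(server: str, changefeed_id: str) -> urllib.request.Request:
    # Assumed v1 endpoint that resumes a paused changefeed.
    return build_request(server, f"changefeeds/{changefeed_id}/resume")


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholders: replace with a real cdc server address and changefeed ID.
    status = send(rebalance_request("127.0.0.1:8300", "my-changefeed"))
    print("rebalance request returned HTTP", status)
```

Building the request separately from sending it lets you print and review the URL before anything touches a production cluster.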

| username: wluckdog

I have already restarted the CDC task, but the distribution is still uneven.

I have also adjusted the CDC task through commands, but it becomes uneven again after restarting.

| username: neilshen

Uneven scheduling was addressed in v6.2.0, which automatically balances the number of tables on each TiCDC node. In earlier versions, you have to trigger balanced scheduling manually through the API. For the method, refer to asddongmen’s answer.
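
As a small illustration of that version boundary, the following Python sketch decides whether a cluster is older than v6.2.0 and therefore needs the manual API trigger. The function names are hypothetical, and it only assumes the version appears somewhere in the text as `vX.Y.Z`. It does not assume any particular output layout of the cdc version command.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Extract the first X.Y.Z version found in the text as a tuple of ints."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def needs_manual_rebalance(version_text: str) -> bool:
    """True when the version is older than v6.2.0, per the answer above."""
    return parse_version(version_text) < (6, 2, 0)


# The version reported in this thread.
manual = needs_manual_rebalance("v6.1.0")
```

For the v6.1.0 cluster in this thread the check is true, which matches the advice to trigger scheduling through the API.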
