TiCDC in TiDB 6.1 captures only a few hundred tables, but large numbers of unrelated DDL operations (creating databases, tables, and indexes) keep it lagging for a long time

translator_bot · June 23, 2024, 9:03am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 使用TiDB 6.1版本的TiCDC采集，只采几百张表，但是会受到与采集的表无关的大量的创库创表创索引的ddl操作导致CDC采集一直延迟，延迟很长时间

| username: TiDBer_hWeMryFA

Our company uses a shared TiDB cluster with over 6,000 databases and 500,000 tables. Our TiCDC captures data from only a small subset of tables, our own, yet it lags significantly because of the large number of DDL operations on unrelated databases and tables. The official documentation mentions that creating databases, tables, and indexes, as well as large transactions, can affect data collection, but we did not expect every DDL operation to affect it, even those on databases and tables we do not collect. This is a serious design problem in TiCDC: it is almost impossible to collect data normally in a large shared cluster. Is the community working on this issue, and is there a plan to resolve it? If so, will the fix ship in an upcoming version? Will the next version of TiDB reduce the impact of DDL operations on TiCDC collection delay, and when will it be released?

| username: TiDBer_hWeMryFA

Also, if there are a lot of DDL operations on the collected tables, can the TiCDC latency be improved?

| username: TiDBer_hWeMryFA

Does anyone have any optimization methods or other solutions for our current situation?

| username: h5n1

Could you please share the current cluster size and the type of business?

| username: cs58_dba

It sounds like a production aggregation database, gathering all instance data together.

| username: lonng

Hello, could you please use PingCAP Clinic to collect a set of monitoring data for us to analyze? We want to see whether there are any workarounds. We have seen relatively few cases with 500,000 tables before, so once we have analyzed this one we will make targeted optimizations.

PingCAP Clinic: Introduction to the PingCAP Clinic Diagnostic Service | PingCAP Docs

| username: TiDBer_hWeMryFA

Our cluster has 6 TiKV nodes. It holds thousands of databases and 500,000 tables because we shard by customer: one database per customer, with the same set of tables in each database. It is used as a business database, not as a production aggregation database.

| username: TiDBer_hWeMryFA

The key question is whether this can be optimized in future versions. Any DDL operation affects our data collection, even when it is unrelated to the tables we collect.

| username: asddongmen

It will be optimized in future versions.