Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 线上pt-osc变更模式下,不开启dm的pt-osc功能如何才能追踪到rename语句表结构变更
[TiDB Version]
DM: v5.4.1
[Scenario]
Due to the large volume of our data, the upstream has performed database sharding. The disk where our DM tool is located cannot support multiple data sources for full synchronization simultaneously, so we split it into multiple tasks for synchronization. For example, student_01_db is split into one task, and student_02_db is split into another task.
Therefore, since we are synchronizing data to the same table in TiDB from multiple tasks simultaneously, we cannot use the pt-osc feature of the DM tool to automatically change the table structure.
When the upstream MySQL directly performs DDL, dm_meta can monitor the table structure changes. However, when the upstream uses pt-osc, if pt-osc is not enabled in dm_meta, it cannot monitor such table structure changes like rename, resulting in dm_meta still having the old information. This causes field inconsistency errors when writing to the downstream.
Does DM have compatibility for this scenario?
Currently, we are considering modifying it ourselves to make it compatible if it is not already.
For 1. DM worker can be deployed on multiple machines. Are you saying that the combined disk space of the entire cluster is still not enough for a full migration in this scenario?
For example: Our upstream student_tab table has 3000 sub-tables. Since the full synchronization phase is task-based, our disk cannot accommodate 3000 sub-tables at once. Therefore, we split it into two tasks, each with 1500 sub-tables, and perform full + incremental synchronization separately.
For 2.3
Although the DM tool can track upstream rename table structure changes after enabling online, as mentioned in point 1, we have two tasks synchronizing one table in TiDB simultaneously. Therefore, enabling DM’s pt-osc might cause the TiDB table to execute the DDL twice.
For 4
The current issue is that due to cross-task synchronization, we cannot enable online and have to manually change the TiDB table structure. As a result, dm_meta cannot track the rename table structure changes caused by upstream pt-osc operations.
You can also try using s3 as the full directory. This feature currently has no documentation. The usage is to change the loader’s dir configuration
to an s3 address
Looking forward to your feedback on this feature!
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.