DM Syncing JSON Type Fields Results in Base64 Encoding (syncer)

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: DM 同步json类型字段出现base64加密(syncer)

| username: Jjjjayson_zeng

【TiDB Usage Environment】Production Environment
【TiDB Version】TiDB v6.5.1, DM v5.4.0
【Reproduction Path】dm syncer stage
【Encountered Problem: Problem Phenomenon and Impact】
Data errors occurred, affecting usage
【Resource Configuration】
【Attachment: Screenshot/Log/Monitoring】

| username: Billmay表妹 | Original post link

Why are the versions inconsistent?

| username: Billmay表妹 | Original post link

When TiDB DM synchronizes JSON type fields, it transmits and stores the JSON data using base64 encoding. JSON data may contain special characters, such as double quotes and newlines, which could cause parsing errors if the data were transmitted and stored directly; encoding it in base64 first preserves the data's correctness.

In TiDB DM, this functionality is implemented by the Syncer component. Syncer detects JSON type data and, if it contains special characters, encodes it in base64. After data synchronization is complete, the DM Worker decodes the base64-encoded data back into the original JSON format and writes it to the downstream TiDB.

Note that if you find JSON type data encoded in base64 during data synchronization with TiDB DM, this is normal behavior and requires no additional handling. If you read the data while it is still base64-encoded, decode it to obtain the original JSON data.
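
As a rough illustration of the decoding step, here is a minimal Python sketch. It assumes the stored value is standard base64 of UTF-8 JSON text with no extra prefix or wrapper, which may not match what you actually see downstream. The helper name `decode_json_field` is hypothetical and is not part of DM or TiDB.

```python
import base64
import binascii
import json


def decode_json_field(value: str):
    """Parse a field that holds either plain JSON or base64-encoded JSON.

    Hypothetical helper for illustration only; assumes standard base64
    over UTF-8 JSON text.
    """
    # If the value is already valid JSON, return it unchanged.
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        pass

    # Otherwise treat it as base64 and parse the decoded bytes as JSON.
    try:
        raw = base64.b64decode(value, validate=True)
        return json.loads(raw.decode("utf-8"))
    except (binascii.Error, UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("value is neither JSON nor base64-encoded JSON") from exc
```

For example, a document containing double quotes and a newline survives a round trip: encode `json.dumps(doc)` with `base64.b64encode`, pass the resulting string to `decode_json_field`, and the parsed result compares equal to `doc`. If the value in your table carries a prefix or uses a different encoding, this sketch will raise `ValueError` rather than guess.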

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.