Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 使用DTS工具从MySQL同步到TiDB,json类型字段里面顺序错乱,导致数据校验不一致
[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] 8.1
[Reproduction Path] After using Alibaba Cloud DTS tool to synchronize MySQL to TiDB, it was found that the data verification failed. The inconsistencies are all in JSON type fields. For example, the data on the MySQL side is {“c”: xxx, “b”: xxx, “d”: xxx, “a”: xxx}, but after synchronization to TiDB, it becomes {“a”: xx, “b”: xx, “c”: xx, “d”: xx}. What is the reason for this?
It feels like the synchronization tool sorted the JSON fields and then synchronized them to TiDB.
If you manually insert a record into TiDB, will the JSON field still change?
Testing direct writes to TiDB, the order has changed, the issue should be with TiDB.
Is it possible to convert it to a string on the source end and then restore it to JSON on the destination end? This way, the format and order won’t be lost.
The test results are as follows:
mysql:
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 38
Server version: 8.0.35 MySQL Community Server - GPL
Copyright (c) 2000, 2024, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
(root@10.110.66.172) [(none)]>use test
(root@10.110.66.172) [(none)]>use test;
Database changed
(root@10.110.66.172) [test]>CREATE TABLE a (
-> id INT AUTO_INCREMENT PRIMARY KEY,
-> data JSON
-> );
Query OK, 0 rows affected (0.05 sec)
(root@10.110.66.172) [test]>INSERT INTO a (data)
-> VALUES ('{"name": "Xiao Ming", "age": 30, "city": "Beijing"}');
Query OK, 1 row affected (0.01 sec)
(root@10.110.66.172) [test]>select * from a;
+----+-------------------------------------------------+
| id | data |
+----+-------------------------------------------------+
| 1 | {"age": 30, "city": "Beijing", "name": "Xiao Ming"} |
+----+-------------------------------------------------+
1 row in set (0.00 sec)
tidb:
(root@10.102.58.180) [(none)]>use test;
Database changed
(root@10.102.58.180) [test]>CREATE TABLE a (
-> id INT AUTO_INCREMENT PRIMARY KEY,
-> data JSON
-> );
Query OK, 0 rows affected (0.14 sec)
(root@10.102.58.180) [test]>INSERT INTO a (data)
-> VALUES ('{"name": "Xiao Ming", "age": 30, "city": "Beijing"}');
Query OK, 1 row affected (0.01 sec)
(root@10.102.58.180) [test]>select * form a;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your TiDB version for the right syntax to use line 1 column 13 near "form a"
(root@10.102.58.180) [test]>select * from a;
+----+-------------------------------------------------+
| id | data |
+----+-------------------------------------------------+
| 1 | {"age": 30, "city": "Beijing", "name": "Xiao Ming"} |
+----+-------------------------------------------------+
1 row in set (0.00 sec)
So the behavior is consistent. And for the JSON format, I understand that the order of display doesn’t matter as long as the content is the same. 
Thank you for the reply.
I just tried your example, and indeed the order is consistent between MySQL and TiDB. However, I simulated a new data entry:
insert into test1(json_info) values(‘{“cccc”: 111, “aaaaa”: 30, “ddd”: “Beijing”}’);
On the MySQL side:
On the TiDB side:
The order is still messed up.
In the JSON standard, key-value pairs themselves are unordered, but they appear to be ordered when viewed using database tools. It should be possible to use fastjson for serialization when the application reads the data.
Even though it’s sorted, theoretically, isn’t fetching data from JSON also significantly affected by KV?
The write order and the display order in MySQL are not the same.
It doesn’t guarantee the order itself; otherwise, you should store it as an array if you need the order. https://dev.mysql.com/doc/refman/8.0/en/json.html