TiDB 7.5 Unable to Convert String, Behavior Inconsistent with MySQL

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb 7.5 无法转换string,行为与mysql不一致

| username: ShawnYan

[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Problem Phenomenon and Impact]
[Resource Configuration] Enter TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Log/Monitoring]

Issue Summary:
In TiDB 7.5, the test statement reports an error when executed under the utf8mb4 character set, but it works fine under latin1. This behavior is inconsistent with MySQL/MariaDB. Please refer to the specific example below.

TiDB

Test example as follows,

-- tidb 7.5, utf8mb4 (default), reports error
set names utf8mb4;
select cast(compress('b') as char);

-- tidb 7.5, latin1, works fine
set names latin1;
select cast(compress('b') as char);

output:

MySQL [test]> select version();
+--------------------+
| version()          |
+--------------------+
| 8.0.11-TiDB-v7.5.0 |
+--------------------+
1 row in set (0.001 sec)

MySQL [test]> set names utf8mb4;
Query OK, 0 rows affected (0.000 sec)

MySQL [test]> select cast(compress('b') as char);
ERROR 3854 (HY000): Cannot convert string '
MySQL [test]> set names latin1;
Query OK, 0 rows affected (0.000 sec)

MySQL [test]> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x▒J  ▒▒ c c              |
+-----------------------------+
1 row in set (0.001 sec)

MySQL

Compared to MySQL, versions 8.2/8.0 behave consistently, returning results for this statement, but returning null and a warning under utf8mb4 instead of an error.
In MySQL 5.7, both utf8mb4/latin1 return results normally.

  • 8
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.32    |
+-----------+
1 row in set (0.00 sec)

mysql> set names utf8mb4;
Query OK, 0 rows affected (0.00 sec)

mysql> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
| NULL                        |
+-----------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+--------------------------------------------+
| Level   | Code | Message                                    |
+---------+------+--------------------------------------------+
| Warning | 1300 | Invalid utf8mb4 character string: '9C4B02' |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)

mysql> set names latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x▒K  c c               |
+-----------------------------+
1 row in set (0.00 sec)
  • 5.7
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.38    |
+-----------+
1 row in set (0.00 sec)

mysql> set names utf8mb4;
Query OK, 0 rows affected (0.00 sec)

mysql> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x▒K  c c               |
+-----------------------------+
1 row in set (0.00 sec)

mysql> set names latin1;
Query OK, 0 rows affected (0.00 sec)

mysql> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x▒K  c c               |
+-----------------------------+
1 row in set (0.00 sec)

MariaDB

In MariaDB, the results are consistent for utf8mb4/latin1,

MariaDB [(none)]> set names utf8mb4;
Query OK, 0 rows affected (0.002 sec)

MariaDB [(none)]> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x?K  c c               |
+-----------------------------+
1 row in set, 1 warning (0.369 sec)

MariaDB [(none)]> show warnings;
+---------+------+-------------------------------------------------------------+
| Level   | Code | Message                                                     |
+---------+------+-------------------------------------------------------------+
| Warning | 1300 | Invalid utf8mb4 character string: '\x9CK\x02\x00\x00c\x00c' |
+---------+------+-------------------------------------------------------------+
1 row in set (0.000 sec)

MariaDB [(none)]> set names latin1;
Query OK, 0 rows affected (0.000 sec)

MariaDB [(none)]> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x▒K  c c               |
+-----------------------------+
1 row in set (0.000 sec)

MariaDB [(none)]> select version();
+-------------------------------------+
| version()                           |
+-------------------------------------+
| 10.7.3-MariaDB-1:10.7.3+maria~focal |
+-------------------------------------+
1 row in set (0.000 sec)
| username: 像风一样的男子 | Original post link

This should be a MySQL issue. In version 8.0, the COMPRESS function returns a compressed string that cannot be converted to char. Did MySQL version 8.2 change this secretly?

| username: ShawnYan | Original post link

5.7/8.0 test results are the same,

mysql> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x▒K  c c               |
+-----------------------------+
1 row in set (0.02 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.32    |
+-----------+
1 row in set (0.00 sec)


mysql> select cast(compress('b') as char);
+-----------------------------+
| cast(compress('b') as char) |
+-----------------------------+
|    x▒K  c c                |
+-----------------------------+
1 row in set (0.00 sec)

mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.31-log |
+------------+
1 row in set (0.00 sec)
| username: 像风一样的男子 | Original post link

It really is. I tried it with MySQL 5.7 and it worked fine. TiDB reported an error.

| username: dba远航 | Original post link

This is a version support issue. The new version supports it, while the old version does not.

| username: zhanggame1 | Original post link

Tested it, TiDB is reporting an error.

| username: tidb菜鸟一只 | Original post link

Indeed, TiDB versions all report errors, while MySQL 5.7 to 8 can return results.

| username: ShawnYan | Original post link

Thank you, Wenjun, for your explanation.

| username: Cong-PingCAP | Original post link

This ‘can return’ is also meaningless. The result obtained from Compress itself is an illegal UTF8 encoding. It’s just that MySQL reported a warning, while TiDB reported an error.

| username: ShawnYan | Original post link

Hello, thank you for your reply.

error/warn are two different warning levels, and the way to catch them will be different. This is equivalent to another incompatibility between TiDB and MySQL.

| username: xingzhenxiang | Original post link

It seems like others have raised similar questions before.

| username: ShawnYan | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.