Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: strcmp函数的问题
[TiDB Usage Environment] Poc
[TiDB Version] 8.0.11-TiDB-v7.4.0
[Problem Encountered: strcmp function]

MySQL [test]> select strcmp(‘ABC’,‘ABc’);
±--------------------+
| strcmp(‘ABC’,‘ABc’) |
±--------------------+
| 0 |
±--------------------+
1 row in set (0.00 sec)
Question: In TiDB-v7.4.0, are the characters ‘ABC’ and ‘ABc’ considered equal?
I see TiDB and MySQL perform the same 
Then it’s different from the tutorial. Did I make a mistake somewhere?
The TiDB in the tutorial is version V6.1.
Take a look at the character validation rules.
The default configuration for 7.1.1 is -1.
Check what the default collation of the database is.
This is related to character set collation. I have encountered a similar issue before. Using utf8mb4_bin does not have this problem.
TiDB is case-sensitive by default. When you directly use CREATE DATABASE xxx
, it uses the default settings. Using CREATE DATABASE xxx SET utf8mb4 COLLATE utf8mb4_general_ci;
will result in differences.
Run SELECT @@character_set_database, @@collation_database;
to check what is currently being used.
I can’t view images directly. Please provide the text you need translated.
This is related to character set collation, please see:
MySQL [test]> SET NAMES utf8mb4 COLLATE utf8mb4_bin;
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> select STRCMP('ABC','ABc');
+---------------------+
| STRCMP('ABC','ABc') |
+---------------------+
| -1 |
+---------------------+
1 row in set (0.00 sec)
MySQL [test]> SET NAMES utf8mb4 COLLATE utf8mb4_general_ci;
Query OK, 0 rows affected (0.00 sec)
MySQL [test]> select STRCMP('ABC','ABc');
+---------------------+
| STRCMP('ABC','ABc') |
+---------------------+
| 0 |
+---------------------+
1 row in set (0.00 sec)
MySQL [test]> select version();
+--------------------+
| version() |
+--------------------+
| 5.7.25-TiDB-v6.5.4 |
+--------------------+
1 row in set (0.00 sec)
Character set and collation reference: 字符集和排序规则 | PingCAP 文档中心
Collation precedence reference: https://dev.mysql.com/doc/refman/5.7/en/charset-collation-coercibility.html
According to the precedence rules, the collation used here is the session-level collation.
The results seem to be different.
The variables you are looking at are the default character set and collation of the database. The direct comparison of two texts does not use this. You need to carefully read this article to clarify the character set and collation used in various situations. Refer to: 字符集和排序规则 | PingCAP 文档中心
This is the correct answer. My client’s sorting rule defaults to utf8_general_ci;

After setting it as the default, it meets expectations.

To be honest, TiDB’s documentation is quite confusing. I had to refer to the MySQL manual to get some clarity.
Thanks to everyone above for the help.
Whether to ignore case in collation will affect LIKE matching and string comparison.
Take a look at the character set.
Check whether the character set and collation were specified when the database was created, as well as the character set configuration file.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.