Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 大小写敏感的设置问题
[TiDB Usage Environment] Production Environment
[TiDB Version] 6.5.3
[Encountered Problem: Problem Phenomenon and Impact]
The business side requires case sensitivity, i.e., case distinction!
Dear experts, does TiDB have a case sensitivity setting for SQL queries? For example, is there a difference between select * from a1
and select * from A1
? Are there any related parameters to enable or disable this sensitivity? Or is it related to the character set?
The description is incorrect; it should be the field value, for example:
mysql> select * from t where name = 'A';
+----+------+-----+
| id | name | age |
+----+------+-----+
| 1 | a | 10 |
| 2 | A | 11 |
+----+------+-----+
2 rows in set (0.01 sec)
mysql> select * from t where name = 'a';
+----+------+-----+
| id | name | age |
+----+------+-----+
| 1 | a | 10 |
| 2 | A | 11 |
+----+------+-----+
2 rows in set (0.02 sec)
It should be changed to the following result:
mysql> select * from t where name = 'a';
+----+------+-----+
| id | name | age |
+----+------+-----+
| 1 | a | 10 |
+----+------+-----+
1 row in set (0.00 sec)
mysql> select * from t where name = 'A';
+----+------+-----+
| id | name | age |
+----+------+-----+
| 2 | A | 11 |
+----+------+-----+
1 row in set (0.01 sec)
The default value is not distinguished, you need to change it yourself.
What is the collation of the table?
ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
It is related to collation
Is there a difference from MySQL?
Collate (collation) rules:
*_bin: indicates binary case sensitive collation, which means it is case sensitive
*_cs: case sensitive collation, case sensitive
*_ci: case insensitive collation, not case sensitive
What I understand is that the rule suffix should be consistent, only the default values are different.
TiDB uses binary collation by default. This is different from MySQL, which uses a case-insensitive collation by default.
Use the default utf8mb4_bin collation to distinguish between uppercase and lowercase.
I just tested it, and it seems that if the MySQL table creation statement does not specify a collation, it will not distinguish between uppercase and lowercase field values by default. TiDB, on the other hand, has a default collation and distinguishes between uppercase and lowercase by default.
The configuration switch new_collations_enabled_on_first_bootstrap
can control whether the TiDB cluster is case-sensitive. By default, it is case-sensitive. This is determined during the initial initialization of the cluster (compatible with MySQL’s default case-insensitivity). If you need to enable the new collation framework, set the value of new_collations_enabled_on_first_bootstrap
to true
.
To achieve compatibility on TiDB, the cluster needs to be rebuilt. This parameter can only be configured during the initial initialization of the cluster.
Yes, we encountered it when creating a new cluster. However, generally, case sensitivity is correct and should be rigorous. 
Yes, the key is that some column values do have case sensitivity. 
What is the collation of the table?
The default utf8_bin is case-sensitive.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.