Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: REGEXP 正则表达式匹配, 不能识别 [[:<:]], [[:>:]]这样的单词定界符
[TiDB Usage Environment] Production Environment / Testing / Poc
[TiDB Version] V6.6.0
[Encountered Problem: Problem Phenomenon and Impact]
Executing SQL statement: SELECT ‘a word a’ REGEXP ‘[[:<:]]word[[:>:]]’;
Error: SQL Error [1105] [HY000]: error parsing regexp: invalid character class range: [:<:]
However, it executes correctly in MySQL 5.7:
mysql> SELECT ‘a word a’ REGEXP ‘[[:<:]]word[[:>:]]’; → 1
Try adding escape characters?
TiDB no longer supports the [[:<:]] character class in regular expressions. Instead, it uses the \b metacharacter, which represents a word boundary. Specifically, the \b metacharacter is used for position matching that meets certain criteria, such as matching the beginning and end of words, or the side of numbers and letters. For example, to find content in a string that starts with the word “word”, you can use the regular expression ‘\bword’ for matching.
Your example can be implemented as follows:
SELECT 'a wordx a' REGEXP '\\bword\\b'; → 0
SELECT 'a word a' REGEXP '\\bword\\b'; → 1
SELECT 'a xword a' REGEXP '\\bword\\b'; → 0
Thank you!
After execution, there are no errors, but it seems that the correct result cannot be obtained.
The forum displays two backslashes as one. Change the backslash b in the quotes to two backslashes b.