When querying non-existent data in a TiDB table, executing "select x from xxx order by xxx limit 10, offset 0" results in a full table scan, causing slow query performance

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb 查询表中不存在的数据时,执行 select x from xxx order by xxx limit 10, offset 0 会变成全部扫描,查询缓慢

| username: Hacker_b0cygNCC

【TiDB Usage Environment】Production Environment / Testing / PoC
【TiDB Version】
【Reproduction Path】What operations were performed when the issue occurred
【Encountered Issue: Issue Phenomenon and Impact】
【Resource Configuration】
【Attachments: Screenshots / Logs / Monitoring】

| username: 大飞哥online | Original post link

I’ll go test it in the environment.

| username: Hacker_b0cygNCC | Original post link

How to solve this problem? When querying, it is not known whether it exists. If you have to check for existence before each paginated query, it is very troublesome and violates normal user habits.

| username: Hacker_b0cygNCC | Original post link

Removing limit and order can return results in seconds. However, when adding order and limit to non-existent data, the execution engine goes off track.

| username: 大飞哥online | Original post link

My table has 2 million rows. I queried a non-existent book_id and did not encounter your issue.

| username: 大飞哥online | Original post link

Try running ANALYZE TABLE and then check again.

| username: 大飞哥online | Original post link

My version is v7.1.0.

| username: Hacker_b0cygNCC | Original post link

I am also using version 7.1.0
image


Just over 1.1 million records, execution takes 2.2 seconds,

All of these are slow queries here.

| username: Hacker_b0cygNCC | Original post link

= Filtering also times out

| username: 大飞哥online | Original post link

Show the table structure.

| username: Hacker_b0cygNCC | Original post link

Your query’s order by has an issue. Your order by user_id should be minimal. My order by is by time, time. Using an order by with a minimal field is also fast for me.

| username: Kongdom | Original post link

Are there any hotspots or DDL blockages?

| username: Hacker_b0cygNCC | Original post link

create_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT ‘Data creation time’,
update_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT ‘Data update time’ Just try ordering by time and you’ll know.

| username: Hacker_b0cygNCC | Original post link

The QPS is within 5, and no one is accessing it. It means the TiDB execution engine is malfunctioning.

| username: 大飞哥online | Original post link

Ordering by rated_at is also quite fast.

| username: Hacker_b0cygNCC | Original post link

CREATE TABLE `SyncLogRecord` (
  `a` bigint(20) unsigned NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */,
  `b` bigint(20) NOT NULL,
  `c` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `d` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `e` varchar(255) COLLATE utf8mb4_general_ci NOT NULL DEFAULT 'video',
  `f` int(11) NOT NULL,
  `g` int(11) NOT NULL,
  `h` int(11) NOT NULL,
  `i` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `j` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `k` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `l` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `m` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `n` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `o` int(11) NOT NULL,
  `p` int(11) NOT NULL,
  `q` text COLLATE utf8mb4_general_ci NOT NULL,
  `r` varchar(255) COLLATE utf8mb4_general_ci NOT NULL,
  `s` int(11) NOT NULL,
  `t` varchar(255) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `u` varchar(255) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `v` varchar(255) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `w` varchar(255) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `x` varchar(255) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `y` varchar(255) COLLATE utf8mb4_general_ci NOT NULL DEFAULT '',
  `z` text COLLATE utf8mb4_general_ci NOT NULL,
  `aa` smallint(6) NOT NULL DEFAULT '0',
  `ab` int(11) DEFAULT NULL,
  `ac` varchar(255) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `ad` varchar(255) COLLATE utf8mb4_general_ci DEFAULT NULL,
  `ae` int(11) NOT NULL DEFAULT '0',
  `af` tinyint(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`a`) /*T![clustered_index] CLUSTERED */,
  KEY `logrecord_b` (`b`),
  KEY `logrecord_c` (`c`),
  KEY `logrecord_g` (`g`),
  KEY `logrecord_h` (`h`),
  KEY `logrecord_l` (`l`),
  KEY `logrecore_i` (`i`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
| username: Hacker_b0cygNCC | Original post link

It seems to be a bug in TiDB’s time parsing. TiDB doesn’t know that datetime can be compared. When I use other fields in the table, such as int type order and string type order, other columns can return results quickly.

| username: Hacker_b0cygNCC | Original post link

I think the problem is that the tidb-server process is not running. You can check the status of the tidb-server process using the ps command. If it is not running, you can start it using the systemctl start tidb command.

| username: Hacker_b0cygNCC | Original post link

For this simplest homepage list requirement, sorted in reverse chronological order, querying just one field, with 1 million records, both order by and limit offset have issues. This needs to be resolved quickly; without resolving it, it’s completely useless.

| username: Hacker_b0cygNCC | Original post link

Can’t explain it clearly, create_at doesn’t work, but update_at does. create_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT ‘Data creation time’, update_at datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT ‘Data update time’. This is definitely a TiDB bug, the time is filled by the database.