Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.Original topic: TiDB在“搜索条件具有不定性、模糊性、(极)低区分度的大范围数据分页查询业务”的表现具体如何?
Scenario Description: Pagination query business for large-scale data with indeterminate, fuzzy, and (extremely) low differentiation search conditions.
Example: Over three years, a table gradually (smoothly over time) generated 100 million records. Now, we need to paginate query detailed data for a specific half-year period (e.g., from 2022-03-01 to 2022-08-01). If using MySQL to support this, it would query based on “creation time” with “creation time” as an index. However, because the query condition “creation time” spans a half-year range, which is too large and lacks differentiation, the index becomes “invalid,” leading to a full table scan and increased CPU usage. If multiple people query this SQL simultaneously, the database CPU usage might approach 100%, severely affecting the usage of other functions. Generally, we would synchronize the data to Elasticsearch and use ES’s retrieval capabilities to meet the above requirements.
Question: If MySQL is replaced with TiDB, can it directly meet the above scenario? How is its performance in this scenario?