Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 可见版本数过多会导致查询变慢么?
[TiDB Usage Environment] Production Environment
[TiDB Version] v5.1.0
A simple aggregate query can hit the index, but the query time exceeds 1 minute. The GC setting is 24h.
Can too many replicas cause slow queries?
select aa, sum(c_amount) from table
where c_date > '2023-05-01'
and c_date < '2023-07-01'
group by aa
order by aa
limit 10
Yes, when reading from a large amount of data, needing to skip can slow down the get speed.
If necessary, you can manually release the version.
I also found another problem, why was hashagg used twice 
Once on TiKV, once on TiDB
Each TiKV node gets the grouping results, then aggregates them on TiDB for grouping and sorting…
This ensures the correct results are obtained. (After all, the data is distributed across various TiKV nodes and cannot be grouped and aggregated in one go.)
So why did it perform an IndexRangeScan once and then a TableRowIDScan once?
Of course, it will be slower. TiDB’s MVCC puts historical data and current data together.
TableRangeScan: Table data scan with a range.
TableRowIDScan: Scans table data based on the RowID passed down from the upper layer. Often used to retrieve rows that meet the criteria after an index read operation.
Specifically, it means finding the specific rowid through the index on the c_date column, and then going back to the table to fetch data using the rowid.
With 13 million data entries and no garbage collection, it won’t be fast.
I forcibly used TiFlash, and the time reduced from 1 minute to 7 seconds. It seems that I still have to forcibly use TiFlash.
The first time is group by, the second time is limit. After hashagg, immediately topn, which is limit.
Got it, we’ve decided to use TiFlash.
How to manually release a version?
Logically, the data has been deleted, and this operation must be performed.
Then, manually control the GC of TiKV to achieve the release.
Reference documentation:
Is there a composite index on date and aa?
No, only c_date has an index.
This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.