Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TiFlash does not support the Sort operator
When using TiFlash, I found that it never uses the sort operator, but it can use the topN operator with a limit. Can the sort capability of TiFlash be enhanced to avoid sorting on the TiDB server?
When multiple nodes use the sort operator, doesn’t it need to be re-sorted during merging?
Can't the re-sorting be done in TiFlash? Do you have any idea how slow sorting in TiDB is?
It is generally used in conjunction with the limit operator.
Haven’t you noticed that TiKV is like this too? 
Sort operators generally cannot be pushed down to TiKV or TiFlash; they are executed at the computation layer (TiDB).
TiDB therefore has corresponding optimizations. When a SQL statement containing Sort would otherwise run out of memory, TiDB spills the intermediate data to disk by default.
The ORDER BY clause corresponds to the Sort operator node in the query plan tree. The system combines adjacent Limit and Sort operators into a TopN operator node, which represents extracting the top N records according to a certain sorting rule. The Limit node is equivalent to a TopN node with an empty sorting rule.
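To make the TopN idea concrete, here is a minimal Python sketch. It is not TiDB or TiFlash code; the names `top_n` and `distributed_top_n` are hypothetical and only illustrate why TopN is easy to push down: each storage node returns at most N rows, and the coordinator merges the already-sorted partial results instead of sorting everything again.

```python
import heapq
import itertools


def top_n(rows, n, key=None):
    """Return the first n rows under the sort rule `key`.

    With key=None the sort rule is empty, so this degenerates to a
    plain Limit: the first n rows in arrival order.
    """
    if key is None:
        return list(itertools.islice(rows, n))
    # nsmallest returns the n smallest rows, already sorted ascending.
    return heapq.nsmallest(n, rows, key=key)


def distributed_top_n(partitions, n, key):
    """Hypothetical coordinator: merge per-node TopN results."""
    # Each "node" computes its local TopN (at most n rows, sorted).
    local_results = [top_n(part, n, key) for part in partitions]
    # The coordinator does a k-way merge of the sorted partial results
    # and stops after n rows; no full re-sort is needed.
    merged = heapq.merge(*local_results, key=key)
    return list(itertools.islice(merged, n))


if __name__ == "__main__":
    partitions = [[5, 1, 9], [3, 8, 2], [7, 4, 6]]
    print(distributed_top_n(partitions, 3, key=lambda v: v))
```

A full Sort without Limit has no such bound on the rows each node returns, which is one reason it is harder to push down than TopN.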
This is not a bug but a feature request. TiFlash is an MPP engine; only if it had the independent computing capability to complete aggregation and sorting internally would the upper-layer TiDB no longer need to handle this.
I think this requirement is quite reasonable.
The situation with TiFlash and TiKV is different. In MPP mode, TiFlash nodes can exchange data with each other, so TiFlash should be able to provide global sorting.
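One common way an MPP engine can produce a global order using data exchange is range partitioning followed by a local sort on each node. The sketch below is a generic illustration of that technique in Python, not a description of how TiFlash works or would implement it; `range_partition_sort` and the `boundaries` split points are assumptions made for the example.

```python
import bisect


def range_partition_sort(node_rows, boundaries):
    """Simulate a range-partitioned global sort across nodes.

    node_rows:  one unsorted list of values per source node.
    boundaries: sorted split points; target node i receives values v
                with boundaries[i-1] <= v < boundaries[i].
    """
    buckets = [[] for _ in range(len(boundaries) + 1)]
    # Exchange step: every source node routes each row to the target
    # node that owns the row's value range.
    for rows in node_rows:
        for value in rows:
            target = bisect.bisect_right(boundaries, value)
            buckets[target].append(value)
    # Local step: each target node sorts only its own range.
    for bucket in buckets:
        bucket.sort()
    return buckets


if __name__ == "__main__":
    node_rows = [[5, 1, 9], [3, 8, 2], [7, 4, 6]]
    buckets = range_partition_sort(node_rows, boundaries=[3, 6])
    # Reading the buckets in node order yields the globally sorted
    # result, so the coordinator only has to concatenate them.
    print([v for bucket in buckets for v in bucket])
```

Whether this is faster than sorting in TiDB depends on the exchange cost and on how evenly the split points divide the data.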
There is no need to send it to TiDB for processing.
Of course, I am not sure how this would affect sorting performance. If sorting with TiFlash in MPP mode turned out to be slower than sorting in TiDB, that would be a reason not to do this.