TiFlash does not support the sort operator

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiflash不支持sort算子

| username: 人如其名

When using TiFlash, I found that the Sort operator is never pushed down to it, although TopN (a Sort combined with a Limit) can be. Can TiFlash's sorting capability be enhanced so that the sort does not have to run on the TiDB server?

| username: tidb菜鸟一只 | Original post link

When multiple nodes use the sort operator, doesn’t it need to be re-sorted during merging?
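To make the concern concrete, here is a minimal sketch (plain Python, not TiDB or TiFlash code). It assumes each node returns a locally sorted run; the coordinator still has to combine them, though a k-way merge is cheaper than re-sorting everything from scratch. The function name `merge_sorted_runs` and the sample data are hypothetical.

```python
import heapq

def merge_sorted_runs(runs):
    """K-way merge of already-sorted lists into one sorted list."""
    # heapq.merge assumes every input is sorted and yields items lazily.
    return list(heapq.merge(*runs))

# Hypothetical per-node outputs, each sorted locally.
node_runs = [[1, 4, 9], [2, 3, 10], [5, 6, 7]]
print(merge_sorted_runs(node_runs))
```

The merge only ever compares the current head of each run, so the per-node sort work is not wasted.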

| username: TiDBer_b3Gr5aVA | Original post link

Can't the re-sort be done in TiFlash? Do you have any idea how slow sorting in TiDB is?

| username: paulli | Original post link

It is generally used in conjunction with the limit operator.

| username: 小龙虾爱大龙虾 | Original post link

Haven’t you noticed that TiKV is like this too? :smiley:

| username: Jellybean | Original post link

Sort operators generally cannot be pushed down to TiKV or TiFlash; they are executed at the computation layer, that is, on the TiDB server.

For that reason there are corresponding optimizations. When a SQL statement containing Sort would otherwise run out of memory, TiDB spills the sort to disk by default.

The ORDER BY clause corresponds to the Sort operator node in the query plan tree. The system combines adjacent Limit and Sort operators into a TopN operator node, which represents extracting the top N records according to a certain sorting rule. The Limit node is equivalent to a TopN node with an empty sorting rule.
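As a rough illustration of why merging Sort and Limit into TopN matters, here is a minimal sketch in plain Python. It is not TiDB's implementation, and the function names and sample rows are hypothetical. A full sort orders every row before the limit is applied, while TopN only needs to track the best N candidates during the scan.

```python
import heapq
from itertools import islice

def sort_then_limit(rows, key, n):
    """Sort followed by Limit: orders every row, then keeps the first n."""
    return sorted(rows, key=key)[:n]

def top_n(rows, key, n):
    """TopN: tracks only the n smallest rows while scanning the input."""
    return heapq.nsmallest(n, rows, key=key)

def limit(rows, n):
    """Limit alone: the first n rows in arrival order, with no sorting rule."""
    return list(islice(rows, n))

# Hypothetical (id, name) rows in arbitrary arrival order.
rows = [(3, "c"), (1, "a"), (5, "e"), (2, "b"), (4, "d")]
by_id = lambda row: row[0]
print(top_n(rows, by_id, 2))
print(limit(rows, 2))
```

Both `sort_then_limit` and `top_n` return the same rows; the difference is how much intermediate state each one has to hold.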

| username: redgame | Original post link

Can’t go this way…

| username: xfworld | Original post link

This is not a bug but a feature request. TiFlash is an MPP engine; if it had the independent computing capability to complete aggregation and sorting internally, the upper TiDB layer would not need to handle them.

| username: 有猫万事足 | Original post link

I think this requirement is quite reasonable.
The situation with TiFlash and TiKV is different. In MPP mode, TiFlash can exchange data with each other and should have global sorting capabilities.
There is no need to send it to TiDB for processing.
Of course, I am not sure how sorting performance would compare. If sorting in TiFlash with MPP turned out to be slower than sorting in TiDB, that would be a reason not to do this.
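For reference, one common way an engine with data exchange can sort globally is a range-partition exchange followed by local sorts. The sketch below is plain Python under that assumption. It does not describe how TiFlash works or would work, and `range_partition_sort`, the split points, and the sample data are all hypothetical.

```python
import bisect

def range_partition_sort(node_rows, boundaries):
    """Simulate a global sort via range-partition exchange plus local sorts.

    node_rows:  list of lists, the unsorted values held by each node.
    boundaries: ascending split points; yields len(boundaries) + 1 partitions.
    """
    partitions = [[] for _ in range(len(boundaries) + 1)]
    # Exchange step: route each value to the partition that owns its range.
    for rows in node_rows:
        for value in rows:
            partitions[bisect.bisect_right(boundaries, value)].append(value)
    # Local step: each partition sorts independently of the others.
    for part in partitions:
        part.sort()
    # Partitions cover disjoint ascending ranges, so concatenating them
    # in order gives a globally sorted result without a final merge.
    return [value for part in partitions for value in part]

# Hypothetical data spread over three nodes, with split points 3 and 6.
print(range_partition_sort([[9, 1, 5], [3, 8, 2], [7, 4, 6]], [3, 6]))
```

The cost is the extra shuffle across nodes and the risk of skewed partitions when the split points are poorly chosen, which is the kind of overhead that would have to be measured against sorting in TiDB.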