[TiFlash] How does the cost model choose between TiKV and TiFlash? What is the algorithm?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 【tiflash】代价模型是如何选择tikv还是tiflash的?算法是什么?

| username: TiDBer_abThS2LT

【TiDB Usage Environment】
Test environment.

【TiDB Version】
Latest version.

【Encountered Problem】

【TiFlash】How does the cost model choose between TiKV and TiFlash? What is the algorithm? Is there any related documentation?


| username: xiaohetao

After TiDB parses the SQL, the optimizer builds the execution plan. You can inspect that plan (for example with EXPLAIN) to see which operators and which storage engine the optimizer chose.

| username: gary

Smart selection: either the cost-based optimizer (CBO) picks the engine automatically, or you select it manually.
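
For the manual side, a minimal sketch of the two usual knobs follows: the `READ_FROM_STORAGE` optimizer hint and the `tidb_isolation_read_engines` session variable. The Python helpers and the table name `orders` are hypothetical and only build the SQL text; check the generated statements against the documentation for your TiDB version before relying on them.

```python
def read_from_storage_hint(engine, table):
    """Build a READ_FROM_STORAGE optimizer hint for a single table."""
    engine = engine.upper()
    if engine not in ("TIKV", "TIFLASH"):
        raise ValueError(f"unsupported engine: {engine}")
    return f"/*+ READ_FROM_STORAGE({engine}[{table}]) */"


def engine_isolation_stmt(engines):
    """Build a session-level statement that restricts which engines may be read."""
    return "SET SESSION tidb_isolation_read_engines = '{}';".format(",".join(engines))


# Hypothetical table `orders`: ask the optimizer to read it from TiFlash.
query = f"SELECT {read_from_storage_hint('tiflash', 'orders')} COUNT(*) FROM orders;"
print(query)

# Restrict the current session to TiKV (plus TiDB's own in-memory tables).
print(engine_isolation_stmt(["tikv", "tidb"]))
```

The hint affects a single statement, while the session variable affects every statement in the session, so the hint is usually the less invasive way to override one bad plan.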

| username: alfred

The core job of the optimizer is to select the execution plan with the lowest estimated cost. However, the optimizer is not infallible and may choose the wrong execution plan due to certain factors. In such cases, manual intervention is required.
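
As a rough illustration of "lowest cost wins", the sketch below picks the cheapest plan from a set of estimated costs and shows how a bad estimate leads to the wrong pick. The plan names and all numbers are made up for the example, and inaccurate estimates are only one assumed cause of a bad plan; this is not TiDB's real planner code.

```python
def pick_cheapest(costs):
    """Return the name of the candidate plan with the lowest cost."""
    return min(costs, key=costs.get)


# Made-up estimated costs, as the optimizer might compute them from statistics.
estimated = {
    "tikv_index_lookup": 120.0,
    "tikv_table_scan": 900.0,
    "tiflash_table_scan": 300.0,
}

# Made-up "true" costs, assuming the index lookup touches far more rows
# than the optimizer estimated.
actual = {
    "tikv_index_lookup": 5000.0,
    "tikv_table_scan": 900.0,
    "tiflash_table_scan": 300.0,
}

chosen = pick_cheapest(estimated)  # what the optimizer would run
best = pick_cheapest(actual)       # what it should have run

if chosen != best:
    print(f"optimizer chose {chosen}, but {best} is cheaper: intervene manually")
```

The selection rule itself is trivial; the quality of the result depends entirely on how close the estimated costs are to the real ones.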

| username: 特雷西-迈克-格雷迪

Table statistics are the basis for the optimizer's cost calculation. Whether it picks TiFlash or TiKV mainly depends on the amount of data in the table and on how many rows and columns the query reads. To understand the exact algorithm, read the source code directly.
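
To make the "rows and columns" intuition concrete, here is a toy cost comparison between a row store and a column store. The formulas, the constants, and every name in it are assumptions chosen for illustration; TiDB's real cost model is considerably more detailed and lives in the planner source code.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    row_count: int      # total rows in the table (from statistics)
    column_count: int   # total columns in the table


# Assumed constants for this toy model only.
COLUMNAR_STARTUP_COST = 1000.0  # fixed overhead of starting a columnar scan
COLUMNAR_CELL_COST = 0.2        # per-cell cost in the column store
ROW_CELL_COST = 1.0             # per-cell cost in the row store


def row_store_cost(stats, rows_read):
    # A row store can use an index to touch only the matching rows,
    # but it reads every column of each row it touches.
    return rows_read * stats.column_count * ROW_CELL_COST


def column_store_cost(stats, columns_read):
    # A column store scans all rows, but only the requested columns.
    return COLUMNAR_STARTUP_COST + stats.row_count * columns_read * COLUMNAR_CELL_COST


def choose_engine(stats, rows_read, columns_read):
    """Pick the engine with the lower toy cost."""
    if row_store_cost(stats, rows_read) <= column_store_cost(stats, columns_read):
        return "tikv"
    return "tiflash"


stats = TableStats(row_count=1_000_000, column_count=20)

# Point lookup: few rows, all columns.
print(choose_engine(stats, rows_read=10, columns_read=20))

# Analytical scan: every row, two columns.
print(choose_engine(stats, rows_read=1_000_000, columns_read=2))
```

In this toy model the point lookup goes to the row store and the wide scan over a few columns goes to the column store, which matches the intuition in the post above.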