Why is there no seek_factor when tidb_cost_model_version=2?

translator_bot · June 22, 2024, 5:02am

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb_cost_model_version=2情况下为何没有seek_factor

| username: 人如其名

【TiDB Usage Environment】Testing
【TiDB Version】v6.5.3

With tidb_cost_model_version=1 there is a variable tidb_opt_seek_factor=20 (seekFactor is the IO cost of seeking the start value of a range in TiKV or TiFlash). It represents the cost of the first lookup when a copTask (or a paging request) fetches data, and I personally think this value is still on the low side. Under tidb_cost_model_version=2, however, I see two problems:

1. No related parameters can be adjusted. Under tidb_cost_model_version=1 the factors can be tuned, but under version 2 they are hard-coded in the source, see: tidb/planner/core/plan_cost_ver2.go at a7b54adfede165328fab966e288d2d9402943d7c · pingcap/tidb · GitHub
2. There is no seek_factor cost factor. When an index query goes back to the table, the handles are organized into multiple ranges and the rows are fetched tidb_index_lookup_size keys at a time. Because finding row data by handle (key value) is a logical lookup, unlike the physical lookup of traditional databases, at least one seek always occurs, and a seek costs much more than the "next" calls of a scan. I also feel that even if a seek_factor existed, it should not be charged as a single seek per range as in cost model 1. Instead, the number of seeks should be estimated from the "distance" between the keys in the range, which would make the model more accurate. Refer to: 执行计划问题-Index Lookup Join效率低下 - TiDB 的问答社区 (Execution plan issue: Index Lookup Join is inefficient). In that post, the large number of seek operations in TiKV caused by index table lookups led to reading a large number of blocks and to poor performance.
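The second issue above can be illustrated with a minimal sketch. This is a toy model, not TiDB's actual cost formula: the function names, the `max_gap` threshold, and the `scan_factor` value are assumptions made up for illustration. Only the seek factor of 20 comes from the post. It contrasts charging one seek per range with estimating seeks from the distance between handles.

```python
from typing import Sequence


def cost_per_range(num_ranges: int, rows: int, row_size: float,
                   scan_factor: float = 1.5, seek_factor: float = 20.0) -> float:
    """Toy cost: one seek per range plus a scan cost proportional to bytes read."""
    return num_ranges * seek_factor + rows * row_size * scan_factor


def estimate_seeks(handles: Sequence[int], max_gap: int) -> int:
    """Estimate seeks from key distance (hypothetical heuristic).

    The first handle always needs a seek. A later handle needs a new seek
    only when it is more than max_gap away from the previous handle;
    otherwise it is assumed to be reachable by cheap "next" calls.
    """
    ordered = sorted(handles)
    if not ordered:
        return 0
    seeks = 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > max_gap:
            seeks += 1
    return seeks


def cost_by_distance(handles: Sequence[int], row_size: float, max_gap: int,
                     scan_factor: float = 1.5, seek_factor: float = 20.0) -> float:
    """Toy cost: estimated seeks times seek_factor, plus the same scan term."""
    seeks = estimate_seeks(handles, max_gap)
    return seeks * seek_factor + len(handles) * row_size * scan_factor


if __name__ == "__main__":
    clustered = [1, 2, 3, 4, 5, 6]
    scattered = [1, 5_000, 90_000, 400_000, 2_000_000, 9_000_000]
    # Same row count, very different seek estimates.
    print(cost_by_distance(clustered, row_size=10, max_gap=10))
    print(cost_by_distance(scattered, row_size=10, max_gap=10))
```

With the same number of rows, the scattered handles are charged six seeks and the clustered handles only one, which is the difference a single-seek-per-range model cannot express.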

Therefore, I would like to ask:

1. Will the cost factors of tidb_cost_model_version=2 be exposed as parameters in a future version, so that customers can adjust them in extreme cases?
2. Without a seek_factor under tidb_cost_model_version=2, will the cost of index table lookups be estimated inaccurately, so that the optimizer picks IndexLookupJoin instead of HashJoin and the query runs inefficiently? Will you consider adding this factor in the future?
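To make the concern in the second question concrete, here is a hedged toy comparison. None of these formulas or numbers come from TiDB: the row counts, the per-row costs, and the function names are all hypothetical. It only shows how dropping a per-lookup seek term can flip a cost-based choice between the two join types.

```python
def lookup_join_cost(outer_rows: int, seek_cost: float,
                     scan_cost_per_row: float = 1.0) -> float:
    """Toy IndexLookupJoin cost: one seek and one row read per outer row."""
    return outer_rows * seek_cost + outer_rows * scan_cost_per_row


def hash_join_cost(outer_rows: int, inner_rows: int,
                   scan_cost_per_row: float = 1.0,
                   hash_cost_per_row: float = 0.5) -> float:
    """Toy HashJoin cost: scan and hash every row on both sides."""
    return (outer_rows + inner_rows) * (scan_cost_per_row + hash_cost_per_row)


def pick_plan(seek_cost: float, outer_rows: int = 100_000,
              inner_rows: int = 1_000_000) -> str:
    """Return the cheaper plan under this toy model."""
    lookup = lookup_join_cost(outer_rows, seek_cost)
    hashed = hash_join_cost(outer_rows, inner_rows)
    return "IndexLookupJoin" if lookup < hashed else "HashJoin"


if __name__ == "__main__":
    print(pick_plan(seek_cost=0.0))   # seek cost ignored
    print(pick_plan(seek_cost=20.0))  # seek cost charged per lookup
```

In this sketch the lookup join looks cheaper when the seek cost is ignored, and the hash join looks cheaper once each lookup is charged a seek. Whether the real optimizer behaves this way depends on the actual version 2 formulas.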

translator_bot · June 22, 2024, 5:02am

| username: zhanggame1

I support making these factors more adjustable.

translator_bot · June 22, 2024, 5:02am

| username: redgame

Judging by how other similar software works, there will probably be hidden parameters.