TiFlash 7.5.1 Internal: Can't find a proper physical plan for this query

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tiflash7.5.1 Internal : Can’t find a proper physical plan for this query

| username: 何明_亿玛

[TiDB Usage Environment] Production Environment
[TiDB Version]
7.5.1
mysql> SELECT @@tidb_enforce_mpp;
| @@tidb_enforce_mpp |
| 1 |
1 row in set (0.00 sec)

mysql> SELECT @@tidb_allow_mpp;
| @@tidb_allow_mpp |
| 1 |
1 row in set (0.00 sec)

mysql> SELECT @@tidb_opt_enable_hash_join;
| @@tidb_opt_enable_hash_join |
| 1 |

[Reproduction Path] Operations performed that led to the issue
EXPLAIN SELECT /+read_from_storage(tiflash[synrpt_tt_promotion])/
COUNT(DISTINCT promotion_id) AS ad_id,
advertiser_id
FROM
synrpt_tt_promotion
WHERE
DATE = ‘2024-05-15’
AND stat_cost > 0
GROUP BY
advertiser_id
Error Code: 1815
Internal: Can’t find a proper physical plan for this query

[Encountered Problem: Phenomenon and Impact]
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots/Logs/Monitoring]

| username: 何明_亿玛 | Original post link

COUNT(DISTINCT promotion_id) AS ad_id is seen as a slow query running on TiKV. When I try to force it to run on TiFlash, it reports an error.

count(1) runs normally.

COUNT(DISTINCT) on other tables also runs normally.

| username: songxuecheng | Original post link

Check the status of this table in TiFlash.

| username: foxchan | Original post link

The TiFlash replica is normal.

| username: Lloyd-Pottiger | Original post link

Open this and take a look.

| username: 何明_亿玛 | Original post link

AND stat_cost > 0
This field is defined as a string, which leads to implicit type conversion. stat_cost VARCHAR(10) DEFAULT NULL COMMENT ‘Represents the estimated cost of the advertisement during the campaign period. The data for the day may fluctuate and stabilize the next day.’

Changing it to AND stat_cost > ‘0’ works fine, but this is a bug. It would be best if the system could fix it to avoid triggering bugs due to implicit conversion, which makes the SQL unusable.

| username: 何明_亿玛 | Original post link

Strangely, the same table in my online yixintui database reports an error when forcing TiFlash. However, after renaming it to the test database, forcing TiFlash does not report an error and follows the TiKV execution plan.

| username: 芝士改变命运 | Original post link

Look, look.

| username: windtaker | Original post link

May I ask if setting tidb_allow_tiflash_cop to true will still cause an error when using AND stat_cost > 0?

| username: 何明_亿玛 | Original post link

After setting tidb_allow_tiflash_cop to true, there are no more errors, and the execution plan is correct.

Without forcing TiFlash, it can still use TiFlash.

| username: 友利奈绪 | Original post link

Good, good.

| username: windtaker | Original post link

Could you share the schema of this table? I want to see if I can reproduce the issue.

| username: guo-shaoge | Original post link

If convenient, you can use the plan replayer, which provides more complete information: 使用 PLAN REPLAYER 保存和恢复集群现场信息 | PingCAP 文档中心

| username: 何明_亿玛 | Original post link

CREATE TABLE synrpt_tt_promotion (
advertiser_id bigint(20) NOT NULL COMMENT ‘advertiser account id’,
project_id bigint(20) NOT NULL COMMENT ‘project ID’,
promotion_id bigint(32) NOT NULL COMMENT ‘advertisement ID’,
date char(10) NOT NULL COMMENT ‘date’,
hour int(2) NOT NULL COMMENT ‘hour’,
promotion_name varchar(255) DEFAULT NULL COMMENT ‘advertisement plan name’,
stat_cost varchar(10) DEFAULT NULL COMMENT ‘indicates the estimated cost of the advertisement during the delivery period. The data of the day may fluctuate and stabilize the next day’,
show_cnt varchar(10) DEFAULT NULL COMMENT ‘the number of times the advertisement is shown to users. Calculation method: the number of valid and billed displays determined by the platform.’,
cpm_platform varchar(10) DEFAULT NULL COMMENT ‘the cost per thousand impressions of the advertisement, calculated as: total cost / number of impressions * 1000.’,
click_cnt varchar(10) DEFAULT NULL COMMENT ‘when a user clicks on the advertisement material, a click event is triggered, which is considered a valid advertisement click.’,
ctr varchar(10) DEFAULT NULL COMMENT ‘the percentage of times the advertisement is clicked compared to the number of impressions. Calculation method: number of clicks / number of impressions * 100%’,

create_time datetime DEFAULT CURRENT_TIMESTAMP,
sync_time datetime DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (advertiser_id,project_id,promotion_id,date,hour) /*T![clustered_index] NONCLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T! SHARD_ROW_ID_BITS=4 PRE_SPLIT_REGIONS=2 */ COMMENT=‘Hourly report data of the trial version advertisement’
PARTITION BY RANGE COLUMNS(date)
(

PARTITION p20240415 VALUES LESS THAN (“2024-04-16”),
PARTITION p20240416 VALUES LESS THAN (“2024-04-17”),
PARTITION p20240417 VALUES LESS THAN (“2024-04-18”),
PARTITION p20240418 VALUES LESS THAN (“2024-04-19”),
PARTITION p20240419 VALUES LESS THAN (“2024-04-20”),
PARTITION p20240420 VALUES LESS THAN (“2024-04-21”),
PARTITION p20240421 VALUES LESS THAN (“2024-04-22”),
PARTITION p20240422 VALUES LESS THAN (“2024-04-23”),
PARTITION p20240423 VALUES LESS THAN (“2024-04-24”),
PARTITION p20240424 VALUES LESS THAN (“2024-04-25”),
PARTITION p20240425 VALUES LESS THAN (“2024-04-26”),
PARTITION p20240426 VALUES LESS THAN (“2024-04-27”),
PARTITION p20240427 VALUES LESS THAN (“2024-04-28”),
PARTITION p20240428 VALUES LESS THAN (‘2024-04-29’),
PARTITION p20240429 VALUES LESS THAN (‘2024-04-30’),
PARTITION p20240430 VALUES LESS THAN (‘2024-05-01’),
PARTITION p20240501 VALUES LESS THAN (‘2024-05-02’),
PARTITION p20240502 VALUES LESS THAN (‘2024-05-03’),
PARTITION p20240503 VALUES LESS THAN (‘2024-05-04’),
PARTITION p20240504 VALUES LESS THAN (‘2024-05-05’),
PARTITION p20240505 VALUES LESS THAN (‘2024-05-06’),
PARTITION p20240506 VALUES LESS THAN (‘2024-05-07’),
PARTITION p20240507 VALUES LESS THAN (‘2024-05-08’),
PARTITION p20240508 VALUES LESS THAN (‘2024-05-09’),
PARTITION p20240509 VALUES LESS THAN (‘2024-05-10’),
PARTITION p20240510 VALUES LESS THAN (‘2024-05-11’),
PARTITION p20240511 VALUES LESS THAN (‘2024-05-12’),
PARTITION p20240512 VALUES LESS THAN (‘2024-05-13’),
PARTITION p20240513 VALUES LESS THAN (‘2024-05-14’),
PARTITION p20240514 VALUES LESS THAN (‘2024-05-15’),
PARTITION p20240515 VALUES LESS THAN (‘2024-05-16’),
PARTITION p20240516 VALUES LESS THAN (‘2024-05-17’),
PARTITION p20240517 VALUES LESS THAN (‘2024-05-18’),
PARTITION p20240518 VALUES LESS THAN (‘2024-05-19’),
PARTITION p20240519 VALUES LESS THAN (‘2024-05-20’),
PARTITION p20240520 VALUES LESS THAN (‘2024-05-21’),
PARTITION p20240521 VALUES LESS THAN (‘2024-05-22’))

| username: guo-shaoge | Original post link

Could you upload the complete plan replayer? The issue couldn’t be reproduced with just the schema. For usage instructions, see: https://docs.pingcap.com/zh/tidb/stable/sql-plan-replayer @He Ming_Yi Ma

| username: 何明_亿玛 | Original post link

Still the same error.

image

| username: 何明_亿玛 | Original post link

After changing back to tidb_allow_tiflash_cop=1, replayer.zip 198M upload error

| username: 何明_亿玛 | Original post link

The statistical information was too large to upload. I removed the statistical information, repackaged it, and uploaded it.