Execution plan cop_task max: 2.03s, significantly longer than others, why is that?

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 执行计划cop_task max: 2.03s,明显大于其他的,为神马呢

| username: 大飞飞呀

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue: Problem Phenomenon and Impact]
A query by unique key takes nearly 3 seconds according to the execution plan.
The execution plan also shows Concurrency: OFF.

	id                     	task     	estRows	operator info                                                                                                                                                                                                                                                                                 	actRows	execution info                                                                                                                                                                                                                                                                                                                                                	memory 	disk
	Projection_7           	root     	1.00   	stat.table_abc.i_id, stat.table_abc.i_date, stat.table_abc.i_member_id, stat.table_abc.i_count, stat.table_abc.i_amount, stat.table_abc.ch_rank_list	1      	time:2.78s, loops:2, Concurrency:OFF                                                                                                                                                                                                                                                                                                                          	3.43 KB	N/A
	└─TopN_10              	root     	1.00   	stat.table_abc.i_id:desc, offset:0, count:1000                                                                                                                                                                                                                         	1      	time:2.78s, loops:2                                                                                                                                                                                                                                                                                                                                           	3.44 KB	N/A
	  └─IndexLookUp_28     	root     	1.00   	                                                                                                                                                                                                                                                                                              	1      	time:2.78s, loops:3, index_task: {total_time: 706.6ms, fetch_handle: 706.6ms, build: 1.02µs, wait: 2.21µs}, table_task: {total_time: 7.72s, num: 1, concurrency: 8}                                                                                                                                                                                         	11.3 KB	N/A
	    ├─IndexRangeScan_26	cop[tikv]	1.00   	table:table_abc, index:UNIQ_I_MEMBER_ID(I_MEMBER_ID, I_DATE), range:[813604959883265,813604959883265], keep order:false                                                                                                                                          	1      	time:706.5ms, loops:3, cop_task: {num: 1, max: 706.4ms, proc_keys: 1, tot_proc: 1ms, rpc_num: 1, rpc_time: 706.4ms, copr_cache_hit_ratio: 0.00}, tikv_task:{time:1ms, loops:1}, scan_detail: {total_process_keys: 1, total_keys: 2, rocksdb: {delete_skipped_count: 0, key_skipped_count: 1, block: {cache_hit_count: 11, read_count: 0, read_byte: 0 Bytes}}}	N/A    	N/A
	    └─TableRowIDScan_27	cop[tikv]	1.00   	table:table_abc, keep order:false                                                                                                                                                                                                                                      	1      	time:2.07s, loops:2, cop_task: {num: 1, max: 2.03s, proc_keys: 1, rpc_num: 1, rpc_time: 2.03s, copr_cache_hit_ratio: 0.00}, tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 1, total_keys: 1, rocksdb: {delete_skipped_count: 0, key_skipped_count: 0, block: {cache_hit_count: 16, read_count: 0, read_byte: 0 Bytes}}}                      	N/A    	N/A


| username: 有猫万事足 | Original post link

The high rpc_time is the issue.

| username: cassblanca | Original post link

The communication delay between nodes is relatively high. The higher the value of RPC Time, the greater the communication delay between nodes, which may affect the query response time and performance.
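To make the rpc_time observation concrete, here is a minimal Python sketch that pulls `rpc_time` and the `tikv_task` time out of an execution info string and reports the difference. The helper names `parse_duration` and `rpc_overhead` are made up for this illustration and are not part of TiDB; the two sample strings are shortened copies of the execution info in the plan above. A large gap only suggests that the time went somewhere other than coprocessor processing (for example network or waiting on the TiKV side); it does not say which.

```python
import re

# Unit suffixes that appear in the execution info, mapped to seconds.
_UNITS = {"ns": 1e-9, "µs": 1e-6, "us": 1e-6, "ms": 1e-3, "s": 1.0}
_DURATION = r"[0-9.]+(?:ns|µs|us|ms|s)"


def parse_duration(text):
    """Convert a duration such as '706.4ms' or '2.03s' to seconds."""
    m = re.fullmatch(r"([0-9.]+)(ns|µs|us|ms|s)", text.strip())
    if m is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return float(m.group(1)) * _UNITS[m.group(2)]


def rpc_overhead(execution_info):
    """Return (rpc_time, tikv_time, rpc_time - tikv_time) in seconds.

    Hypothetical helper: it relies only on the text layout shown in the
    plan above, not on any TiDB API.
    """
    rpc = re.search(r"rpc_time:\s*(" + _DURATION + ")", execution_info)
    tikv = re.search(r"tikv_task:\s*\{time:\s*(" + _DURATION + ")", execution_info)
    if rpc is None or tikv is None:
        raise ValueError("rpc_time or tikv_task time not found")
    rpc_s = parse_duration(rpc.group(1))
    tikv_s = parse_duration(tikv.group(1))
    return rpc_s, tikv_s, rpc_s - tikv_s


# Shortened copies of the two cop[tikv] rows from the plan above.
INDEX_SCAN_INFO = (
    "time:706.5ms, loops:3, cop_task: {num: 1, max: 706.4ms, proc_keys: 1, "
    "tot_proc: 1ms, rpc_num: 1, rpc_time: 706.4ms, copr_cache_hit_ratio: 0.00}, "
    "tikv_task:{time:1ms, loops:1}"
)
TABLE_SCAN_INFO = (
    "time:2.07s, loops:2, cop_task: {num: 1, max: 2.03s, proc_keys: 1, "
    "rpc_num: 1, rpc_time: 2.03s, copr_cache_hit_ratio: 0.00}, "
    "tikv_task:{time:0s, loops:1}"
)

for name, info in [("IndexRangeScan_26", INDEX_SCAN_INFO),
                   ("TableRowIDScan_27", TABLE_SCAN_INFO)]:
    rpc_s, tikv_s, gap = rpc_overhead(info)
    print(f"{name}: rpc={rpc_s:.3f}s tikv={tikv_s:.3f}s gap={gap:.3f}s")
```

For both operators almost the entire rpc_time falls outside the reported TiKV task time, which matches the replies here that point at rpc_time rather than at the scan itself.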

| username: redgame | Original post link

It could be an index problem or a statistics problem.

| username: h5n1 | Original post link

  1. For Concurrency: OFF, check the relevant parameter settings: show variables like '%concurrency%';
  2. Check whether mysql.opt_rule_blacklist has any content. Add the hint /*+ LIMIT_TO_COP() */ to the SQL and try again.
  3. Re-analyze the table to refresh its statistics.

| username: 大飞飞呀 | Original post link

I found the main problem is that a certain cop_task is obviously timing out. How can this be resolved?

| username: h5n1 | Original post link

Is this stably reproducible?

| username: tidb菜鸟一只 | Original post link

A slow cop_task is mainly due to TiKV. First, check the load on the TiKV nodes to see whether any node is loaded significantly more heavily than the others.

| username: tidb狂热爱好者 | Original post link

“cop_task being slow is mainly due to TiKV. First, check the load on the TiKV nodes to see if any nodes have significantly higher loads.”

“Little rookie, your technical skills are clearly very high. You hit the nail on the head with one sentence.”

| username: 大飞飞呀 | Original post link

No, it is not stably reproducible. It only happens occasionally.

| username: 大飞飞呀 | Original post link

Can you see which TiKV has a higher load through the execution plan?

| username: h5n1 | Original post link

Not all SQL statements are affected when it happens, right?

| username: tidb菜鸟一只 | Original post link

Check the load on each TiKV node in Grafana for the corresponding time period. If it only happens occasionally, it could also be a hotspot issue.

| username: 大飞飞呀 | Original post link

How to solve hotspot issues?

| username: zhanggame1 | Original post link

A query that uses the index shouldn't be this slow. First, check whether the cluster has any performance issues, such as occasional high I/O, CPU, or memory usage. You can analyze this through monitoring.

| username: ealam_小羽 | Original post link

I encountered a similar issue before: full table scans or slow queries on other tables made queries that normally use indexes unstable, with long wait times. I later asked an expert from PingCAP, who said that in versions before 7, queries can still affect each other's resources. Version 7 introduced some optimizations for resource isolation.