Explain how to understand key_skipped_count in the execution plan

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: explain 执行计划中怎么理解key_skipped_count

| username: Raymond

Background: Testing tpch’s orders
I found the key_skipped_count in the execution plan confusing and don’t know what it really means.
Note: No data has been deleted. After generating the table data for tpch’s orders, it was directly counted.

mysql> select count() from orders where O_ORDERDATE >= “1995-01-01” and O_ORDERDATE <= “1995-01-03”;
±---------+
| count(
) |
±---------+
| 1811 |
±---------+
1 row in set (0.00 sec)

mysql> explain analyze select count(*) from orders where O_ORDERDATE >= “1995-01-01” and O_ORDERDATE <= “1995-01-03”;
±----------------------------±--------±--------±----------±---------------------------------------------±-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±------------------------------------------------±----------±-----+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
±----------------------------±--------±--------±----------±---------------------------------------------±-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±------------------------------------------------±----------±-----+
| StreamAgg_17 | 1.00 | 1 | root | | time:2.56ms, loops:2 | funcs:count(Column#12)->Column#10 | 388 Bytes | N/A |
| └─IndexReader_18 | 1.00 | 1 | root | | time:2.55ms, loops:2, cop_task: {num: 1, max: 2.49ms, proc_keys: 1811, rpc_num: 1, rpc_time: 2.47ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 15} | index:StreamAgg_9 | 267 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 1 | cop[tikv] | | tikv_task:{time:1ms, loops:2}, scan_detail: {total_process_keys: 1811, total_process_keys_size: 83306, total_keys: 1812, get_snapshot_time: 890.3µs, rocksdb: {key_skipped_count: 1811, block: {cache_hit_count: 9}}} | funcs:count(1)->Column#12 | N/A | N/A |
| └─IndexRangeScan_16 | 257.81 | 1811 | cop[tikv] | table:orders, index:O_ORDERDATE(O_ORDERDATE) | tikv_task:{time:1ms, loops:2} | range:[1995-01-01,1995-01-03], keep order:false | N/A | N/A |
±----------------------------±--------±--------±----------±---------------------------------------------±-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±------------------------------------------------±----------±-----+
4 rows in set (0.01 sec)

| username: zhanggame1 | Original post link

First, analyze the table, otherwise, there will be issues with the execution plan.

| username: redgame | Original post link

The number of skipped index keys. This value is usually non-zero when an index is used.

| username: Kongdom | Original post link

Rocksdb_key_skipped_count: The number of deleted (tombstone) keys encountered when RocksDB scans data.

| username: Raymond | Original post link

How should this be understood?

| username: 春风十里 | Original post link

I haven’t seen this situation here. Did the original poster delete the table?

| username: tidb菜鸟一只 | Original post link

In fact, it refers to the data that has been deleted but not yet garbage collected. TiKV needs to filter out this data during scanning.

| username: 有猫万事足 | Original post link

When deleting data, it is not directly deleted but marked as deleted by changing its status. The same principle applies to updating data: a new record is written, and the old record is marked as deleted. The actual deletion of the record happens during the TiKV garbage collection (GC).

A high number indicates that the table is frequently updated or deleted. If the GC is progressing normally, it means this is expected behavior and can be ignored. You can check the progress of the GC in the TiKV details panel under the GC group in Grafana.