The "select count(1) from table is very slow"

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: select count(1) from 表很慢 (select count(1) from a table is very slow)

| username: TiDBer_FMWXa7ja

【TiDB Usage Environment】Production Environment / Testing / PoC
【TiDB Version】6.5
【Reproduction Path】
【Encountered Problem: Problem Phenomenon and Impact】TiFlash is not installed, and running select count(1) on a table with 1 billion rows is very slow. Is this normal? What are the possible solutions?
【Resource Configuration】
【Attachments: Screenshots/Logs/Monitoring】

| username: buddyyuan | Original post link

Check the resource usage of TiKV.

| username: Kongdom | Original post link

Does the table have a primary key? Are there conditions like “order by” in the complete statement?

| username: h5n1 | Original post link

If there is no index, count(1) will scan the entire table, while count(*) will use the primary key index, etc.
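If in doubt, it is easy to compare the access paths yourself; a minimal check, using an illustrative table name t:

-- compare the plans the optimizer picks for the two forms
explain select count(1) from t;
explain select count(*) from t;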

| username: Kongdom | Original post link

Is there a difference between these two? In actual use, count(1) and count(*) seem to behave the same, but internally our company has always believed that count(1) is more efficient than count(*).

| username: TiDBer_FMWXa7ja | Original post link

The table has a composite primary key, and the statement has no conditions: select count(*) from d_secondminutedata;

| username: Kongdom | Original post link

Execute this statement to see the execution plan:
explain analyze select count(*) from d_secondminutedata;

| username: h5n1 | Original post link

I tried it and it’s the same.

| username: Kongdom | Original post link

:+1: Practical expert

| username: TiDBer_FMWXa7ja | Original post link

[The original post contained only a screenshot, which was not translated.]

| username: h5n1 | Original post link

What is your resource configuration and disk type? 47 seconds for 1 billion rows is not bad.

| username: TiDBer_FMWXa7ja | Original post link

6 TiKV nodes, each with 16 cores and 16 GB of memory. The 47 seconds is from the second execution; the first execution took almost 200 seconds.

| username: 我是咖啡哥 | Original post link

I have encountered this before. If DML on the table is very frequent, the count will be slow; after stopping the application, the result comes back in seconds.
For a table of around 300,000 rows, the difference was 12 seconds versus 0.2 seconds.
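A rough way to check whether piled-up old versions (heavy DML that GC has not yet cleaned up) are involved, assuming TiDB v5.0 or later and using this thread's table name as an example:

-- how long old row versions are kept before GC removes them
show global variables like 'tidb_gc_life_time';
-- approximate row count and recent modifications from statistics, without scanning the table
show stats_meta where table_name = 'd_secondminutedata';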

| username: h5n1 | Original post link

What type of disk is it? Please upload the results of the explain analyze. Some parts of the screenshot are not visible.

| username: buddyyuan | Original post link

Please upload this statement's entry from the slow log.
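For example, assuming the statement crossed the slow-log threshold, something along these lines pulls the key fields from the cluster slow-query table (the LIKE pattern is only illustrative):

-- query_time vs cop_time/process_time and the key counts show where the time went
select time, query_time, cop_time, process_time, process_keys, total_keys, query
from information_schema.cluster_slow_query
where query like '%count%d_secondminutedata%'
order by time desc limit 5;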

| username: liuis | Original post link

Let’s take a look at the explain analyze.

| username: WalterWj | Original post link

With 1 billion records, if TiKV is not very busy and there are enough CPUs, it should be able to complete the count in just a few seconds. :thinking:

| username: ffeenn | Original post link

What is the resource configuration of the TiDB nodes?

| username: 人如其名 | Original post link

It seems that TiDB equates count(1) and count(*) at the parser level.

| username: 人如其名 | Original post link

Look at this, it’s so slow…

mysql> explain analyze select /*+ read_from_storage(tikv[lineitem]) */ count(*) from lineitem;
+----------------------------+-------------+----------+-----------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------+-----------+------+
| id                         | estRows     | actRows  | task      | access object  | execution info                                                                                                                                                                                                                                                                                                                                                                                   | operator info                     | memory    | disk |
+----------------------------+-------------+----------+-----------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------+-----------+------+
| HashAgg_13                 | 1.00        | 1        | root      |                | time:39.4s, loops:2, partial_worker:{wall_time:39.362488482s, concurrency:5, task_num:1, tot_wait:3m16.81108575s, tot_exec:94.27µs, tot_time:3m16.811192801s, max:39.362249552s, p95:39.362249552s}, final_worker:{wall_time:0s, concurrency:5, task_num:1, tot_wait:3m16.811393418s, tot_exec:30.083µs, tot_time:3m16.811426486s, max:39.362296841s, p95:39.362296841s}                         | funcs:count(Column#18)->Column#17 | 16.5 KB   | N/A  |
| └─TableReader_14           | 1.00        | 156      | root      |                | time:39.4s, loops:2, cop_task: {num: 159, max: 7.69s, min: 212.8ms, avg: 3.54s, p95: 7.15s, max_proc_keys: 462798, p95_proc_keys: 462756, tot_proc: 8m47.6s, tot_wait: 33.5s, rpc_num: 159, rpc_time: 9m23.4s, copr_cache_hit_ratio: 0.00, build_task_duration: 274.2µs, max_distsql_concurrency: 15}, backoff{regionMiss: 2ms}                                                                  | data:HashAgg_6                    | 438 Bytes | N/A  |
|   └─HashAgg_6              | 1.00        | 156      | cop[tikv] |                | tikv_task:{proc max:7.2s, min:212ms, avg: 3.32s, p80:5.75s, p95:6.83s, iters:70417, tasks:159}, scan_detail: {total_process_keys: 72080384, total_process_keys_size: 2594893824, total_keys: 73692884, get_snapshot_time: 16.4s, rocksdb: {delete_skipped_count: 130912, key_skipped_count: 73762895, block: {cache_hit_count: 2208, read_count: 455143, read_byte: 4.43 GB, read_time: 5.11s}}} | funcs:count(1)->Column#18         | N/A       | N/A  |
|     └─TableFullScan_12     | 70899712.00 | 72080384 | cop[tikv] | table:lineitem | tikv_task:{proc max:7.2s, min:211ms, avg: 3.32s, p80:5.75s, p95:6.83s, iters:70417, tasks:159}                                                                                                                                                                                                                                                                                                   | keep order:false                  | N/A       | N/A  |
+----------------------------+-------------+----------+-----------+----------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------+-----------+------+
4 rows in set (39.36 sec)

Main slow points:

  1. copr_cache_hit_ratio=0, no cache hit on the TiDB side.
  2. get_snapshot_time: 16.4s, not sure why it’s so slow…
  3. read_byte: 4.43 GB, read_time: 5.11s, physical read occurred.
  4. The table scan itself is slow. My understanding: time:39.4s - get_snapshot_time:16.4s - read_time:5.11s = 17.89s of remaining scan time.
    17.89s * 15 (max_distsql_concurrency) / 159 tasks ≈ 1.69s, i.e. a purely in-memory scan of one region averages about 1.69 seconds, which feels very slow…
    Data was being loaded concurrently at the time, so the CPU was probably insufficient. But why is get_snapshot_time so high?

Sometimes 15 tasks working together can barely keep up with one task in a traditional database. It feels like the data read and scan efficiency of TiKV is not high.
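Since the original question notes that TiFlash is not installed: for full-table analytical counts like this, the usual route is to add a columnar replica and push the aggregation down to TiFlash. A rough sketch, assuming TiFlash nodes have been added to the cluster (replica count and table name are illustrative):

-- create a columnar replica of the table on TiFlash
alter table d_secondminutedata set tiflash replica 1;
-- wait until AVAILABLE = 1 before relying on it
select table_name, replica_count, available, progress
from information_schema.tiflash_replica
where table_name = 'd_secondminutedata';
-- then force the count onto TiFlash and compare with the TiKV plan above
select /*+ read_from_storage(tiflash[d_secondminutedata]) */ count(*) from d_secondminutedata;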