Select count() is very slow

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: select count() 非常慢

| username: TiDBer_8aM3dZG6

[TiDB Usage Environment] Production Environment
[TiDB Version] V6.5.1
[Reproduction Path]
[Encountered Problem: Problem Phenomenon and Impact]
A table with less than 100 million records is very slow when using the following select count query, often taking twenty to thirty seconds. An index has already been added to the query condition column, but the reason for the slowness is unknown.
explain analyze SELECT count(c_ckey)
FROM vm_company_derived_variables_all
WHERE (operated_years > 2)
AND (c_regist_capi > 1000.0)
[Resource Configuration]
One physical machine, virtualized into 10 instances, PD: 3 instances, TIDB: 2 instances, each with 32GB memory, TIKV: 4 instances, each with 32GB memory. SSD hard drive.
[Attachment: Screenshot/Log/Monitoring]
explain analyze as attachment
explain.txt (1.8 KB)

| username: redgame | Original post link

Try this:

ALTER TABLE vm_company_derived_variables_all ADD INDEX idx_operated_years_c_regist_capi (operated_years, c_regist_capi);
| username: zhanggame1 | Original post link

What indexes have been added to this table specifically? Are there indexes for the two query conditions?

| username: Kongdom | Original post link

How about trying count(1)?

| username: zhanggame1 | Original post link

The performance is not affected by what is written after count.

| username: TiDBer_8aM3dZG6 | Original post link

Both columns in the query have indexes.

| username: TiDBer_8aM3dZG6 | Original post link

Both columns being queried have indexes. I just gave this example, but there are many columns that need to be queried, and the combinations of conditions are not fixed. Do I need to add composite indexes for all of them?

| username: zhanggame1 | Original post link

Have you considered trying TiFlash for single table queries?

| username: linnana | Original post link

Columnar storage TiFlash is relatively fast for statistical queries.

| username: 人如其名 | Original post link

In this case, it is highly recommended to use TiFlash; otherwise, creating too many indexes will result in high maintenance costs. Additionally, setting distsql_concurrency to 256 is too extreme. It can cause the entire cluster to become unstable during large queries. It is still recommended to use the default value of 15.

| username: Anna | Original post link

Columnar storage TiFlash is relatively fast for statistical queries.

| username: cy6301567 | Original post link

Have you updated the statistics?

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.