TiDB Concurrent Queries Are Slow

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb并发查询慢

| username: TiDBer_7Q5CQdQd

This is a slow query log for a test page. Each slow query, when checked individually, takes less than 1 second.

So, I suspect that it is due to insufficient resources. How can I further confirm this?

Below are the execution plans for a few slow queries:

pd, tidb share 3 machines with 8 cores and 16GB RAM each
tikv has 3 machines with 12 cores and 24GB RAM each
tiflash has 3 machines with 24 cores and 48GB RAM each

Execution plans are as follows:
Slow query log, fourth entry 4s, individual query 0.9s

Slow query log, fifth entry 3.9s, individual query 0.85s

Slow query log, fifth entry 3.5s, individual query 0.85s

This is my tiflash monitoring at that time, with a final CPU utilization rate of 184. Strangely, when I used the top command to check the CPU status on the server, it had already reached 1600. I don’t know what’s wrong.

| username: 昵称想不起来了 | Original post link

Try adjusting parameters for optimization? How to speed up Count?

| username: tidb菜鸟一只 | Original post link

Can you check the SQL to see exactly where it’s slow?

| username: 像风一样的男子 | Original post link

Are two execution plans with different durations the same?

| username: TiDBer_7Q5CQdQd | Original post link

The same.

| username: TiDBer_7Q5CQdQd | Original post link

Executing a single SQL is not slow. I suspect that the parallel execution of SQL is causing insufficient CPU resources, but I am only executing about 10 SQLs in parallel.

| username: TiDBer_7Q5CQdQd | Original post link

It’s not that the count query is slow; it’s that executing it individually is fast, but executing it in parallel is slow.

| username: tidb菜鸟一只 | Original post link

On the dashboard page, directly click on the slow SQL, it will show specifically which stage is slow. Take a look first.

| username: TiDBer_7Q5CQdQd | Original post link

The image you provided is not visible. Please provide the text you need translated.

| username: 像风一样的男子 | Original post link

Check the KV monitoring for CPU, memory, and IO.

| username: TiDBer_7Q5CQdQd | Original post link

The CPU usage is very low, around 20%. The memory usage stays around 60%.

| username: 有猫万事足 | Original post link

All of your queries are executed in TiFlash MPP mode. TiFlash’s support for high concurrency is not as good as TiKV’s. It’s expected that individual executions are fast, but performance drops quickly with a dozen or so concurrent executions.

| username: TiDBer_7Q5CQdQd | Original post link

That means 3 servers with 24 cores each can’t handle over 10 MPP SQL queries. How many CPU cores would be needed then? Or would it be better if I switch back to the original self-select execution mode?

| username: 有猫万事足 | Original post link

It is best to look at the execution plan of slow queries that take 3-4 seconds. See which stage is causing the slowness.

| username: tidb菜鸟一只 | Original post link

The support for TiFlash here is not good enough. If it’s TiKV, it will show which stage is slow in TiKV, but for TiFlash, you can only see that it is indeed slow in TiFlash. Post the SQL from the dashboard, change it to explain analyze SELECT SQL_MO_CACHE and execute it to see the execution plan.