Coprocessor Execution Time is Relatively Long

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Coprocessor 执行耗时时间比较长

| username: yulei7633

When executing an SQL statement, the slow-query time displayed on the dashboard is inconsistent with the actual execution time, as shown in the figure below (screenshot omitted):

Manual execution takes about 5 seconds, but the dashboard shows a slow-query time of 11.5 seconds for the same SQL statement.

I don't know the reason. I analyzed the statement's execution time carefully (screenshot omitted).

Can any expert help analyze the cause? A good answer will be marked as the best answer.

| username: yulei7633 | Original post link

The time exceeds 11.5 seconds. Is it because the times of the parallel tasks are summed? This table is a partitioned table with 128 partitions, and the execution plan shows that every partition has data to scan.

| username: yulei7633 | Original post link

The time queried from the slow query table (slow_query) is indeed 11.5 seconds. The time I manually executed is 5.10 seconds, but the slow query table records 11.5 seconds, which I don’t quite understand.
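To see where the 11.5 seconds go, the time breakdown for that statement can be pulled straight from the slow-query table. A hedged sketch (column names follow TiDB's `INFORMATION_SCHEMA.SLOW_QUERY` schema; the `LIKE` filter is a placeholder you would replace with your own table name):

```sql
-- Compare the wall-clock query time with the coprocessor-side totals.
-- Process_time and Cop_time are accumulated across all cop tasks,
-- so on a parallel plan they can exceed Query_time.
SELECT query_time, parse_time, compile_time,
       process_time, wait_time, cop_time
FROM information_schema.slow_query
WHERE query LIKE '%your_table%'   -- placeholder filter
ORDER BY time DESC
LIMIT 5;
```

If `process_time` is much larger than `query_time`, the gap is the summed parallel work, not extra wall-clock latency.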

| username: tidb菜鸟一只 | Original post link

When you manually execute it on the client, is it the first time or have you executed it multiple times? There might be caching involved. I suggest adding sql_no_cache to see if it’s still slow. The one in the dashboard might not be the one you manually executed, but rather the one from the application.

| username: yulei7633 | Original post link

This is currently a test environment, and I control the application myself. At the time of testing, the application was not running any statements. When executing manually, I ran the statement many times, not just once. The slow queries in the dashboard are indeed the statements I just executed manually.

| username: 像风一样的男子 | Original post link

The total time taken by the coprocessor, which involves multiple KV multi-threaded parallel executions, can exceed the SQL execution time.
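This can be seen directly in `EXPLAIN ANALYZE` output: the cop-task summary reports how many coprocessor tasks ran and their accumulated times. A sketch (the field names and numbers below are illustrative of the output format, not taken from the original post):

```sql
EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
-- In the execution-info column of the coprocessor-level operator
-- (e.g. TableFullScan), look for a summary of the form:
--   cop_task: {num: 128, max: ..., min: ..., avg: ..., tot_proc: ...}
-- With 128 partitions scanned in parallel, tot_proc (the summed
-- processing time of all cop tasks) can far exceed the wall-clock
-- query time reported to the client.
```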

| username: yulei7633 | Original post link

I understand this now. Thank you.

| username: tidb菜鸟一只 | Original post link

Try adding sql_no_cache to check the execution time each time. For the same client, only the first execution will be a bit slow, and it will be faster after caching.

| username: yulei7633 | Original post link

How exactly do you add this? Can you write it out? For example: select count(*) from t;

| username: Fly-bird | Original post link

SELECT SQL_NO_CACHE count(1) FROM $table;

| username: tidb菜鸟一只 | Original post link

SELECT SQL_NO_CACHE COUNT(*) FROM t;

| username: yulei7633 | Original post link

Got it, I’ll test it.

| username: yulei7633 | Original post link

Whether or not I add it, the result is the same.

| username: TiDBer_小阿飞 | Original post link

  1. First, run Explain Analyze + SQL to see the time distribution and find which operator takes the most time.
  2. The cumulative Coprocessor execution time can appear greater than the SQL execution time because TiKV processes tasks in parallel; the cumulative execution time is not wall-clock elapsed time.
  3. On the dashboard, check the slow query's basic-information page to see whether the "used statistics" are pseudo.
  4. If they are pseudo, run Analyze table xxxx to recollect the table's statistics, then re-run the query with Explain Analyze + SQL and compare the estRows values with those from before.
  5. Adjust the parameter pseudo-estimate-ratio. It is the ratio of modified rows to the total number of rows in the table: when this ratio exceeds the set value, the statistics are considered outdated and pseudo statistics are used. The default value is 0.8, with a minimum of 0 and a maximum of 1; it is the criterion for deciding whether the statistics are invalid.
    You can set this parameter to 1 so that TiDB does not fall back to pseudo statistics when executing SQL.
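Steps 4 and 5 above can be sketched as follows (table name `t` is a placeholder):

```sql
-- Refresh the statistics for the partitioned table, then re-check the plan.
ANALYZE TABLE t;
EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
-- Confirm that "stats:pseudo" no longer appears in the operator info
-- and that estRows is now close to actRows.
```

To change pseudo-estimate-ratio itself, note that it is a TiDB configuration-file item (under `[performance]` in the server's config file), not a session variable, so adjusting it requires updating the configuration and reloading the TiDB server.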