Distsql Duration Execution Time is Very Long

translator_bot · June 22, 2024, 2:59pm

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: Distsql Duration 执行时间很长

| username: Atlan

【TiDB Usage Environment】Production Environment
【TiDB Version】v6.5.0
【Reproduction Path】

The SQL statement is as follows:
SELECT
  r.`room_id`,
  t.`room_name`,
  t.`priority`,
  t.`creator`,
  t.`queue_size`,
  t.`description`,
  t.`speak_limit`
FROM
  `t_room_member` r
  INNER JOIN `t_room_info` t ON r.`room_id` = t.`room_id`
WHERE
  r.`member_id` = ?
  AND r.`room_id` > ?
  AND t.`delete_time` = 0
ORDER BY
  r.`room_id`
LIMIT
  ? [arguments: ("20000001142001", "", 500)]


DROP TABLE IF EXISTS `t_room_info`;
CREATE TABLE `t_room_info` (
  `id` bigint(20) NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */,
  `room_id` varchar(32) NOT NULL,
  `room_name` varchar(32) NOT NULL,
  `priority` int(11) NOT NULL,
  `creator` varchar(32) NOT NULL,
  `queue_size` int(11) NOT NULL,
  `description` text NOT NULL,
  `create_time` bigint(20) NOT NULL,
  `rorg_id` varchar(32) NOT NULL,
  `delete_time` bigint(20) NOT NULL,
  `speak_limit` int(11) NOT NULL DEFAULT '120',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  UNIQUE KEY `idx_room_id` (`room_id`),
  KEY `idx_room_creator` (`creator`),
  KEY `idx_rorg_id_room_id` (`rorg_id`,`room_id`),
  KEY `idx_rorg_id_create_time` (`rorg_id`,`create_time`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![auto_rand_base] AUTO_RANDOM_BASE=570001 */;

DROP TABLE IF EXISTS `t_room_member`;
CREATE TABLE `t_room_member` (
  `id` bigint(20) NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */,
  `room_id` varchar(32) NOT NULL,
  `member_id` varchar(32) NOT NULL,
  `priority` int(11) NOT NULL,
  `create_time` bigint(20) NOT NULL,
  `speak_limit` int(11) NOT NULL,
  `rorg_id` varchar(32) NOT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `idx_member_id` (`member_id`),
  UNIQUE KEY `index_roomID_userID` (`room_id`,`member_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin /*T![auto_rand_base] AUTO_RANDOM_BASE=280388601 */;

The usage statement is as follows:
【Resource Configuration】
CPU: 16 cores, Memory: 128G, Disk: 20T
All TiDB configurations are installed by default using tiup

translator_bot · June 22, 2024, 2:59pm

| username: h5n1 | Original post link

Check the TiKV details → thread CPU, check if the QPS has increased during this period, look at slow SQL, check CPU utilization, and examine the IO and network during this period.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

The images you provided are not visible. Please provide the text content for translation.

translator_bot · June 22, 2024, 2:59pm

| username: h5n1 | Original post link

The CPU is bottlenecked. There should be a large number of reads, right?

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

The image is not visible. Please provide the text content for translation.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

The number of reads is not very large.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

There were no issues before. I have three machines with the above configuration, all originally running Ubuntu 20.04. Later, I reinstalled one machine with Ubuntu 22.04. After that, I rebuilt the cluster, and now the business operations are like this.

translator_bot · June 22, 2024, 2:59pm

| username: tidb菜鸟一只 | Original post link

Your SQL doesn’t seem to have any major issues, why did you bring it up separately?

translator_bot · June 22, 2024, 2:59pm

| username: h5n1 | Original post link

Is this a physical machine deployment? How many CPUs? Show config where name like ‘%readpool%max-thread-count%’, and check other slow SQLs.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

KVM on China Mobile Cloud

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

The image you provided is not visible. Please provide the text you need translated.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

This SQL is very slow.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

The image is not available for translation. Please provide the text content for translation.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

The image you provided is not accessible. Please provide the text you need translated.

translator_bot · June 22, 2024, 2:59pm

| username: h5n1 | Original post link

The SQL you provided seems to take just over 1 second to execute. If there are no other large data volumes or high concurrency SQLs, it is likely an environment issue. I have encountered a similar problem with k8a where incorrect cgroup settings caused the read pool CPU to be very high whenever there was SQL activity. However, in my case, it was a persistent issue. I feel that your problem might be similar.

translator_bot · June 22, 2024, 2:59pm

| username: h5n1 | Original post link

Take a look at the QPS changes on the TiDB monitoring page.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

Sorry, I can’t translate the content from the image. Please provide the text you need translated.

translator_bot · June 22, 2024, 2:59pm

| username: h5n1 | Original post link

QPS increases and duration remains consistent, but why is the CPU maxed out? We need to analyze the SQL and execution plan.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

What SQL? Let me know what you need and I’ll send you a screenshot.

translator_bot · June 22, 2024, 2:59pm

| username: Atlan | Original post link

I really don’t know what “毛sql” means. I thought if you wanted to analyze it, I could give you a screenshot.