Issue of High Concurrency SQL Transfer Between Different KVs in TopSQL

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: topsql中高并发sql在各个kv之间转移问题

| username: 像风一样的男子

In the topsql monitoring, it was observed that a highly concurrent SQL was running in the KV of node 147 during the day yesterday. At 2 AM, it suddenly moved to the KV of node 154 and executed for an hour. At 3 AM, this highly concurrent SQL moved to node 150. What is the principle behind this??

All nodes have not been restarted.

SQL statement:

SELECT
  `bi`.`id` `bikeid`,
  `bi`.`vin`,
  `sku`.`id` `skuid`,
  `sku`.`color` `skucolor`,
  `sku`.`img_url` `skuimageurl`,
  `m`.`id` `modelid`,
  `m`.`model_name` `modelname`,
  `m`.`model_no` `modelno`,
  `type`.`id` `typeid`,
  `type`.`type_name` `typename`,
  `type`.`type_code` `typecode`,
  `series`.`id` `seriesid`,
  `series`.`series_name` `seriesname`,
  `series`.`series_code` `seriescode`,
  `m`.`tire_diameter`,
  `m`.`pole_pair`
FROM
  `t_bike_info` `bi`
  LEFT JOIN `t_bike_sku` `sku` ON `bi`.`sku_id` = `sku`.`id`
  LEFT JOIN `t_bike_model` `m` ON `sku`.`model_id` = `m`.`id`
  LEFT JOIN `t_bike_type` TYPE ON `m`.`type_id` = `type`.`id`
  LEFT JOIN `t_bike_series` `series` ON `m`.`series_id` = `series`.`id`
WHERE
  `bi`.`vin` = ?
  AND `bi`.`del_flag` = ?
  AND `sku`.`del_flag` = ?
  AND `m`.`del_flag` = ?
  AND `type`.`del_flag` = ?
  AND `series`.`del_flag` = ?

The involved table structures all have id bigint(19) NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */ COMMENT ‘Primary Key’.

| username: tidb狂热爱好者 | Original post link

This is just hotspot reading.

| username: 像风一样的男子 | Original post link

So why would he transfer?

| username: 像风一样的男子 | Original post link

Table Structure

CREATE TABLE `t_bike_info` (
  `id` bigint(19) NOT NULL /*T![auto_rand] AUTO_RANDOM(5) */ COMMENT 'Primary Key',
  `sku_id` bigint(19) NOT NULL DEFAULT '0',
  `vin` varchar(64) COLLATE utf8_general_ci NOT NULL DEFAULT '',
  `vin_pwd` varchar(64) COLLATE utf8_general_ci DEFAULT '',
  `sap_vin` varchar(64) COLLATE utf8_general_ci DEFAULT '-1',
  `bind_code` varchar(64) COLLATE utf8_general_ci DEFAULT NULL,
  `frame_manu` varchar(32) COLLATE utf8_general_ci DEFAULT NULL,
  `sc_date` timestamp(3) NULL DEFAULT NULL COMMENT 'Production Date',
  `fh_date` timestamp(3) NULL DEFAULT NULL COMMENT 'Shipping Date',
  `dealer_no` varchar(32) COLLATE utf8_general_ci DEFAULT NULL,
  `djh` varchar(200) COLLATE utf8_general_ci DEFAULT NULL,
  `qc` varchar(60) COLLATE utf8_general_ci DEFAULT NULL,
  `iot_id` bigint(19) DEFAULT NULL,
  `ble_id` bigint(19) DEFAULT NULL,
  `batt_type` tinyint(2) DEFAULT '-1',
  `user_id` bigint(19) DEFAULT NULL,
  `comm_module_type` tinyint(2) DEFAULT '0',
  `meter_id` bigint(19) DEFAULT NULL,
  `material_number` varchar(64) COLLATE utf8_general_ci DEFAULT NULL,
  `zone_code` varchar(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  `zone_name` varchar(60) COLLATE utf8_general_ci DEFAULT NULL COMMENT 'Base Name',
  `country_code` varchar(30) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  `country_name` varchar(100) COLLATE utf8_general_ci DEFAULT NULL,
  `source` tinyint(2) DEFAULT NULL,
  `smart_type` tinyint(2) DEFAULT '-1',
  `bt_unique_key` varchar(64) COLLATE utf8_general_ci NOT NULL,
  `experimental` tinyint(2) NOT NULL DEFAULT '1',
  `del_flag` tinyint(2) DEFAULT '0',
  `create_time` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
  `update_time` timestamp(3) DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED*/,
  KEY `i_bike_info_vin_create_time` (`vin`,`create_time`),
  KEY `i_bike_info_create_time` (`create_time`),
  KEY `i_bike_info_user` (`user_id`),
  KEY `i_bike_info_iot_id` (`iot_id`),
  KEY `i_bike_info_meter_id` (`meter_id`),
  KEY `i_bike_info_ble_id` (`ble_id`),
  KEY `idx_vbimdc` (`vin`,`ble_id`,`iot_id`,`meter_id`,`del_flag`,`create_time`),
  KEY `idx_t_bike_info_sap_vin_normal` (`sap_vin`),
  KEY `i_bike_info_experimental` (`experimental`),
  KEY `idx_t_bike_info_bind_code_normal` (`bind_code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci /*T![auto_rand_base] AUTO_RANDOM_BASE=4712456*/;
| username: forever | Original post link

Is there a large amount of data being written? Is it possible that the region has split and moved to another node?

| username: 像风一样的男子 | Original post link

The possibility of region splitting is very low for these tables since they have almost no writes, only reads.

| username: 像风一样的男子 | Original post link

Region distribution

| username: xfworld | Original post link

If the data being read changes and happens to be distributed across different nodes, it will just switch nodes to retrieve it.

Ideally, if each node has data and it is sufficiently dispersed, it can achieve greater throughput with smaller resource acquisition.

Besides topSQL, you can observe the traffic visually through the dashboard to see if this table is very active. If it is very active, it indicates that there is indeed a hotspot (also known as read skew or write skew).

| username: forever | Original post link

TIDB_HOT_REGIONS_HISTORY can be used to check if the hot regions during this period are the same object.

| username: 像风一样的男子 | Original post link

This table cannot be found on the heatmap.

| username: 像风一样的男子 | Original post link

I checked the TIDB_HOT_REGIONS_HISTORY and found no records in this table.

| username: tidb狂热爱好者 | Original post link

Explain that this is not a heat meter.

| username: 小龙虾爱大龙虾 | Original post link

Regions can be moved. First, check if there are any optimizations that can be made to the execution plan.

| username: 像风一样的男子 | Original post link

The average query speed of this SQL is more than 40 milliseconds, with limited room for optimization.

| username: forever | Original post link

What does this SQL do? Is it similar to a scheduled task, or does it use a time-based index to query data from a recent period?

| username: 像风一样的男子 | Original post link

It’s not a scheduled task; it’s called by the client.

| username: forever | Original post link

Does this situation match?

| username: 像风一样的男子 | Original post link

It’s not about querying data for a period of time, it’s about querying information based on the user’s unique ID, and the result is only one record.

| username: forever | Original post link

I have no idea. Looking at the node status, it basically changes with the pressure in chronological order, which is similar to querying the latest data by time index. After checking by id, the table structure seems to be scattered, so it doesn’t seem like this. Waiting for experts to reply.

| username: zhanggame1 | Original post link

Check the TiDB Dashboard traffic visualization to see which regions have high traffic during that time.