Inconsistency Between TiDB Program and Manual Execution SQL Execution Plan

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB程序&手动执行SQL 执行计划不一致

| username: jiaxin

Question

The same SQL program queries data within one day slowly, but manually querying data within one day is fast. Recently, a certain AP-type business was migrated from MySQL to TiDB, and the application SQL was not modified.

Version

TiDB version 6.1.5

Mybatis prepare precompiled form of SQL

jdbc version
image

jdbc:mysql://%s:%s/%s?characterEncoding=UTF-8&rewriteBatchedStatements=true&allowMultiQueries=true&connectTimeout=2000&socketTimeout=600000

Table Structure and Index

# SQL statement
select xxxx FROM
  xxxxx_004 l
WHERE
  l.mart_code = 'xxxx'
  AND l.yn = 1
  AND l.supplier_code = ?
  AND l.biz_date >= ?
  AND l.biz_date <= ?
ORDER BY
  created DESC,
  id DESC
LIMIT
  40 [arguments: ("xxxx", "2024-01-18 00:00:00", "2024-01-18 00:00:00")];

# Table structure
CREATE TABLE `xxxxxx_004` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `mart_code` varchar(10) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `ledger_no` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `busi_no` varchar(650) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT xxxx,
  `supplier_code` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL COMMENT xxxx,
  `supplier_name` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `contract_no` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `contract_type` tinyint(4) DEFAULT NULL COMMENT xxxx,
  `direct_source_no` varchar(64) COLLATE utf8mb4_unicode_ci DEFAULT '' COMMENT xxxx,
  `exec_detail_uk` varchar(64) COLLATE utf8mb4_unicode_ci DEFAULT '' COMMENT xxxx,
  `schedule_no` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `profit` tinyint(4) DEFAULT NULL COMMENT xxxx,
  `discount_way` tinyint(4) DEFAULT NULL COMMENT xxxx,
  `sku_id` bigint(20) DEFAULT NULL,
  `goods_code` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `goods_name` varchar(256) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `shop_code` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `shop_type` tinyint(4) DEFAULT NULL COMMENT xxxx,
  `store_id` bigint(20) DEFAULT NULL COMMENT xxxx,
  `shop_name` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `rebate_type` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `contract_state` tinyint(4) DEFAULT NULL COMMENT xxxx,
  `create_type` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `quantity` decimal(20,5) DEFAULT NULL COMMENT xxxx,
  `channel_type` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT xxxx,
  `ledger_amount_taxed` decimal(20,5) DEFAULT NULL COMMENT xxxx,
  `ledger_amount_untaxed` decimal(20,5) DEFAULT NULL COMMENT xxxx,
  `ledger_amount_attached` decimal(20,5) DEFAULT NULL COMMENT xxxx,
  `amount_taxed` decimal(20,5) DEFAULT NULL COMMENT xxxx,
  `amount_untaxed` decimal(20,5) DEFAULT NULL COMMENT xxxx,
  `goods_tax_code` varchar(16) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT xxxx,
  `goods_tax_ratio` decimal(20,4) DEFAULT NULL COMMENT xxxx,
  `tax_ratio` decimal(20,8) DEFAULT NULL COMMENT xxxx,
  `collect_date` date DEFAULT NULL COMMENT xxxx,
  `mode` tinyint(4) DEFAULT NULL COMMENT xxxx,
  `bill_no` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `order_no` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `biz_start_date` date DEFAULT NULL,
  `biz_end_date` date DEFAULT NULL COMMENT xxxx,
  `biz_date` date DEFAULT NULL COMMENT xxxx,
  `contract_day` date DEFAULT NULL COMMENT xxxx,
  `category_mi` varchar(32) COLLATE          DEFAULT '',
  `category_sm` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT '',
  `category` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT '',
  `category_la` varchar(32) COLLATE utf8mb4_unicode_ci DEFAULT '',
  `unit` varchar(16) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '' COMMENT xxxx,
  `created_user` varchar(64) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `extend_filed` text COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT xxxx,
  `remarks` varchar(128) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `created` datetime DEFAULT CURRENT_TIMESTAMP,
  `modified` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `yn` tinyint(4) NOT NULL DEFAULT '1')

# Table index
PRIMARY (id)
 uk_ledgerno (ledger_no,mart_code,yn)
 idx_billno (bill_no)
 idx_orderno (order_no)
 idx_collectdate_billno_orderno (collect_date,bill_no,order_no,supplier_code,contract_no,rebate_type,contract_type,profit)
 idx_directsourceno_rebatetype_busino (direct_source_no,rebate_type,busi_no,yn)
 idx_suppliercode_busino (supplier_code,busi_no)
 idx_suppliercode_contractno_bizdate (supplier_code,contract_no,biz_date)
 idx_suppliercode_bizdate (supplier_code,biz_date)
 idx_suppliercode_shopcode_bizdate (supplier_code,shop_code,biz_date)
 idx_bizdate (biz_date)
 idx_contractno (contract_no)
 idx_busino (busi_no)
 idx_suppliercode_bizstartdate (supplier_code,biz_start_date,biz_end_date)
 idx_created_suppliercode_rebatetype (created,supplier_code,rebate_type)
 idx_modified_created_mart (modified,created,mart_code)

Performance

Business query for data within one day times out on the page

Business query for data within multiple days is faster (uses the correct composite index)

Investigation and Analysis

Several slow queries of over 3 minutes in TiDB dashboard

Table statistics health is: 99 Good

No pressure on the TiDB cluster itself

The program query for data within one day and manual execution use different indexes (same SQL)

Program Execution (TiDB) Manual Execution (TiDB) Manual Execution (MySQL)
Query condition: WHERE l.mart_code = ‘xxxx’ AND l.yn = 1 AND l.supplier_code = ? AND l.biz_date >= ? AND l.biz_date <= ? ORDER BY created DESC, id DESC LIMIT 40 [arguments: (“xxxx”, “2024-01-18 00:00:00”, “2024-01-18 00:00:00”)];
Uses index: idx_suppliercode_bizstartdate (supplier_code, biz_start_date, biz_end_date), effectively only filtering by supplier_code, not using biz_date index → Takes 3.5 minutes (unexpected) Uses index: idx_suppliercode_bizdate (supplier_code, biz_date) → Takes 183ms (expected) Uses index: idx_suppliercode_bizdate (supplier_code, biz_date) → Takes 54ms (expected)

Screenshot Information:
Program Execution (TiDB)


Manual Execution (TiDB)


Manual Execution (MySQL)


Temporary Solution

  • Temporarily add new query conditions to the program (SQL uses the composite index)

  • Developers add force index in the code SQL (many SQL types, difficult to modify)

  • Create execution plan binding at the TiDB level (many SQL rows need parameterization, difficult to create)

For example
CREATE BINDING FOR SELECT * FROM t WHERE a > 1 USING SELECT * FROM t use index(idx) WHERE b > 2;

Resolved

  • After disabling the execution plan cache on the compute nodes, the business application developers removed the forced index in the code, and the page report query for data within one day became faster (used the correct composite index)
# Execute on 3 compute nodes respectively
ADMIN FLUSH INSTANCE PLAN_CACHE;
# Execute globally once
set global tidb_enable_prepared_plan_cache=false; 

| username: Kongdom | Original post link

Was the SQL manually executed completely copied from the slow query page? I have encountered several times where slow queries were caused by type conversion in the WHERE condition, but when manually executing, the parameters given were of the correct type.

| username: wangccsy | Original post link

I personally feel that this should be possible, especially if you are using an ORM tool. This is because the ORM itself (or the implementation of JDBC) will perform some SQL optimizations.

| username: jiaxin | Original post link

The slow SQL in the dashboard is the same as the records I saw when I previously opened the general_log in TiDB.

| username: jiaxin | Original post link

I previously checked the general_log of TiDB and didn’t find any settings for session variables (I suspected there might be, but it’s also possible that the TiDB general_log didn’t output the set session variable statements).

| username: Kongdom | Original post link

After handling it this way in the program, will the SQL execution plan in the program be the same as the manually executed SQL plan?

| username: WinterLiu | Original post link

Use the new, not the old. Upgrading TiDB might solve the problem. :grin:

| username: jiaxin | Original post link

Yes.

| username: jiaxin | Original post link

Understood, please provide the Chinese text you need translated.

| username: Kongdom | Original post link

After using force index, can you check if the content of the latter column is the same in both cases?

| username: 小龙虾爱大龙虾 | Original post link

I suspect it’s an issue with the PLAN_CACHE :joy:. Your range-bound values l.biz_date >= and l.biz_date are equal, causing the execution plan to not use the range scan of this composite index. I suggest:
The mybatis framework should have a property, likely setting statementType to STATEMENT, to prevent this SQL from using the prepare statement interface. Give it a try.

| username: 有猫万事足 | Original post link

The guess about the plan cache from the previous comment seems quite plausible.

After all, if it’s not the plan cache, it’s hard to think of why the same SQL with the same conditions would have different execution plans when run manually versus programmatically.

You might consider clearing the plan cache on a specific TiDB instance to see if everything becomes slower or faster.

| username: Trouble | Original post link

Check the JDBC connection parameter useServerPrepStmts to see if both parts are consistent.

| username: redgame | Original post link

This is puzzling… It shouldn’t happen with exactly the same setup.

| username: changpeng75 | Original post link

It seems that there is an issue with the index planning, and the number of indexes is a bit too high.
For composite indexes, you can try putting the date field at the front and the supplier_code at the back.

| username: jiaxin | Original post link

It might really be related to the plan cache. I’ll verify it. Previously, the memory of the computing nodes in another high-TPS TiDB cluster kept increasing, but after disabling the execution plan cache, the memory returned to normal.

| username: jiaxin | Original post link

The issue has been resolved. After turning off the execution plan cache for the compute node, the page report query responds quickly and meets expectations (using the correct composite index).

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.