The table suddenly stopped using the index, causing the overall cluster load to spike

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 表突然不走索引,导致整体集群负载飙升

| username: select

To improve efficiency, please provide the following information. Clear problem descriptions can be resolved faster:
[TiDB Usage Environment]
Production environment

[Overview] Scenario + Problem Overview
Around 3 PM on Saturday, the SQL that was originally executing normally suddenly stopped using the index, causing the overall cluster load to spike.

[Background] What operations have been performed
Collected table statistics
[Phenomenon] Business and database phenomena
All queries against this table are performing full table scans. Using hints to force index usage works, but without forcing, all queries that should use the index do not. Statistics are normal.

12:28:31 sqldataviewpro> select * from INFORMATION_SCHEMA.TABLES where table_name=‘rtdw_dwd_order_bak’;
±--------------±---------------±-------------------±-----------±-------±--------±-----------±-----------±---------------±------------±----------------±-------------±----------±---------------±--------------------±------------±-----------±–
--------------±---------±---------------±--------------±--------------±--------------------------±-------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | TABLE_TYPE | ENGINE | VERSION | ROW_FORMAT | TABLE_ROWS | AVG_ROW_LENGTH | DATA_LENGTH | MAX_DATA_LENGTH | INDEX_LENGTH | DATA_FREE | AUTO_INCREMENT | CREATE_TIME | UPDATE_TIME | CHECK_TIME | TA
BLE_COLLATION | CHECKSUM | CREATE_OPTIONS | TABLE_COMMENT | TIDB_TABLE_ID | TIDB_ROW_ID_SHARDING_INFO | TIDB_PK_TYPE |
±--------------±---------------±-------------------±-----------±-------±--------±-----------±-----------±---------------±------------±----------------±-------------±----------±---------------±--------------------±------------±-----------±–
--------------±---------±---------------±--------------±--------------±--------------------------±-------------+
| def | sqldataviewpro | rtdw_dwd_order_bak | BASE TABLE | InnoDB | 10 | Compact | 11959764 | 468 | 5601425546 | 0 | 765424896 | 0 | NULL | 2022-08-27 16:05:11 | NULL | NULL | ut
f8mb4_bin | NULL | | Order main table | 9154 | NOT_SHARDED(PK_IS_HANDLE) | CLUSTERED |
±--------------±---------------±-------------------±-----------±-------±--------±-----------±-----------±---------------±------------±----------------±-------------±----------±---------------±--------------------±------------±-----------±–
--------------±---------±---------------±--------------±--------------±--------------------------±-------------+
1 row in set (0.05 sec)

12:28:36 sqldataviewpro> show create table rtdw_dwd_order_bak \G;
*************************** 1. row ***************************
Table: rtdw_dwd_order_bak
Create Table: CREATE TABLE rtdw_dwd_order_bak (
id bigint(20) unsigned NOT NULL COMMENT ‘Primary key’,
biz_type int(11) DEFAULT ‘0’ COMMENT ‘Business type’,
order_no varchar(32) DEFAULT ‘’ COMMENT ‘Order number’,
order_type int(11) DEFAULT ‘0’ COMMENT ‘Order type’,
order_sub_type int(11) DEFAULT ‘0’ COMMENT ‘Order subtype’,
seller_id bigint(20) DEFAULT ‘0’ COMMENT ‘Seller ID’,
store_id bigint(20) DEFAULT ‘0’ COMMENT ‘Store ID’,
store_order_no varchar(64) DEFAULT NULL COMMENT ‘Store order number’,
user_id bigint(20) DEFAULT ‘0’ COMMENT ‘User ID’,
mobile varchar(64) DEFAULT NULL COMMENT ‘Contact information’,
addr_id int(11) DEFAULT ‘0’ COMMENT ‘Shipping address ID’,
source int(11) DEFAULT ‘0’ COMMENT ‘Order source’,
source_no varchar(256) DEFAULT ‘’ COMMENT ‘Source order number’,
from_system int(11) DEFAULT ‘0’ COMMENT ‘Sender’,
source_info varchar(1024) DEFAULT NULL COMMENT ‘Source information’,
total_amount int(11) DEFAULT ‘0’ COMMENT ‘Total order amount’,
cost_amount int(11) DEFAULT ‘0’ COMMENT ‘Cost amount’,
pack_amount int(11) DEFAULT ‘0’ COMMENT ‘Packing fee’,
item_amount int(11) DEFAULT ‘0’ COMMENT ‘Total item amount’,
freight_amount int(11) DEFAULT ‘0’ COMMENT ‘Total freight amount’,
discount_amount int(11) DEFAULT ‘0’ COMMENT ‘Total discount amount’,
actual_amount int(11) DEFAULT ‘0’ COMMENT ‘Actual payment amount’,
pay_amount int(11) DEFAULT ‘0’ COMMENT ‘Payment amount’,
change_amount int(11) DEFAULT ‘0’ COMMENT ‘Change amount’,
pay_type int(11) DEFAULT ‘0’ COMMENT ‘Payment method’,
pay_model int(11) DEFAULT ‘0’ COMMENT ‘Settlement mode’,
payer bigint(20) DEFAULT ‘0’ COMMENT ‘Payer’,
pay_no varchar(256) DEFAULT NULL COMMENT ‘Payment order number’,
third_pay_no varchar(256) DEFAULT NULL COMMENT ‘Third-party payment order number’,
pay_info varchar(1024) DEFAULT NULL COMMENT ‘Payment information’,
is_alone int(11) DEFAULT ‘0’ COMMENT ‘Independent payment’,
pay_status int(11) DEFAULT ‘0’ COMMENT ‘Payment status’,
invoice_type int(11) DEFAULT ‘0’ COMMENT ‘Invoice type’,
performance_type int(11) DEFAULT ‘0’ COMMENT ‘Performance type’,
is_immediate_performance int(11) DEFAULT ‘0’ COMMENT ‘Immediate performance’,
performance_platform varchar(128) DEFAULT NULL COMMENT ‘Performance platform’,
performance_info varchar(1024) DEFAULT NULL COMMENT ‘Performance information’,
activity_info varchar(1024) DEFAULT NULL COMMENT ‘Activity information’,
item_info text DEFAULT NULL COMMENT ‘Item information’,
discount_info text DEFAULT NULL COMMENT ‘Discount information’,
seller_remarks text DEFAULT NULL,
buyer_remarks_temp text DEFAULT NULL COMMENT ‘User remarks’,
buyer_remarks text DEFAULT NULL COMMENT ‘Buyer remarks’,
is_receipt int(11) DEFAULT ‘0’ COMMENT ‘Printed’,
receipt_terminal varchar(64) DEFAULT NULL COMMENT ‘Printer terminal’,
status int(11) DEFAULT ‘0’ COMMENT ‘Status’,
after_sale_status int(11) DEFAULT ‘0’ COMMENT ‘Refund status’,
receive_status int(11) DEFAULT ‘0’ COMMENT ‘Order receiving status’,
performance_status int(11) DEFAULT ‘0’ COMMENT ‘Performance status’,
comment_status int(11) DEFAULT ‘0’ COMMENT ‘Comment status’,
refusal_reasons varchar(256) DEFAULT NULL COMMENT ‘Rejection reasons’,
order_at datetime DEFAULT NULL COMMENT ‘Order time’,
pay_at datetime DEFAULT NULL COMMENT ‘Payment time’,
comment_at datetime DEFAULT NULL COMMENT ‘Comment time’,
completed_at datetime DEFAULT NULL COMMENT ‘Order completion time’,
receive_at datetime DEFAULT NULL COMMENT ‘Order receiving time’,
features text DEFAULT NULL COMMENT ‘Extended information’,
sync_flag int(11) DEFAULT ‘0’ COMMENT ‘Sync flag’,
created_at datetime DEFAULT NULL COMMENT ‘Creation time’,
updated_at datetime DEFAULT NULL COMMENT ‘Modification time’,
is_deleted int(11) DEFAULT ‘0’ COMMENT ‘Deletion flag’,
version int(11) DEFAULT ‘0’ COMMENT ‘Version number’,
amount int(11) DEFAULT ‘0’,
wallet_amount int(11) DEFAULT ‘0’,
use_wallet int(11) DEFAULT ‘0’
PRIMARY KEY (id) /*T![clustered_index] CLUSTERED */,
KEY idx_order_at (order_at),
KEY idx_order_no (order_no),
KEY idx_store_seller_at (store_id,order_at),
KEY idx_store_pay (store_id,pay_at),
KEY idx_seller_order (seller_id,order_at),
KEY idx_updated_at (updated_at)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT=‘Order main table’
1 row in set (0.01 sec)

12:30:02 sqldataviewpro> SHOW stats_healthy WHERE table_name=‘rtdw_dwd_order_bak’;
±---------------±-------------------±---------------±--------+
| Db_name | Table_name | Partition_name | Healthy |
±---------------±-------------------±---------------±--------+
| sqldataviewpro | rtdw_dwd_order_bak | | 93 |
±---------------±-------------------±---------------±--------+

12:30:06 sqldataviewpro> EXPLAIN SELECT
→ *
→ FROM
→ rtdw_dwd_order_bak
→ WHERE order_no=‘BT6001611909130084’;
±-------------------------±------------±----------±-------------------------±---------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
±-------------------------±------------±----------±-------------------------±---------------------------------------------------------------------+
| TableReader_7 | 0.99 | root | | data:Selection_6 |
| └─Selection_6 | 0.99 | cop[tikv] | | eq(sqldataviewpro.rtdw_dwd_order_bak.order_no, “BT6001611909130084”) |
| └─TableRangeScan_5 | 11959764.00 | cop[tikv] | table:rtdw_dwd_order_bak | range:[0,+inf], keep order:false |
±-------------------------±------------±----------±-------------------------±---------------------------------------------------------------------+

12:30:53 sqldataviewpro> EXPLAIN SELECT
→ *
→ FROM
→ rtdw_dwd_order_bak FORCE INDEX(idx_order_no)
→ WHERE order_no=‘BT6001611909130084’;
±------------------------------±--------±----------±-------------------------------------------------------±--------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
±------------------------------±--------±----------±-------------------------------------------------------±--------------------------------------------------------------------+
| IndexLookUp_7 | 0.99 | root | | |
| ├─IndexRangeScan_5(Build) | 0.99 | cop[tikv] | table:rtdw_dwd_order_bak, index:idx_order_no(order_no) | range:[“BT6001611909130084”,“BT6001611909130084”], keep order:false |
| └─TableRowIDScan_6(Probe) | 0.99 | cop[tikv] | table:rtdw_dwd_order_bak | keep order:false |
±------------------------------±--------±----------±-------------------------------------------------------±--------------------------------------------------------------------+

[Problem] Current issues encountered

[Business Impact]
Business timeout
[TiDB Version]
5.2.3
[Application Software and Version]

[Attachments] Relevant logs and configuration information

  • TiUP Cluster Display information
  • TiUP Cluster Edit config information

Monitoring (https://metricstool.pingcap.com/)

  • TiDB-Overview Grafana monitoring
  • TiDB Grafana monitoring
  • TiKV Grafana monitoring
  • PD Grafana monitoring
  • Corresponding module logs (including logs 1 hour before and after the issue)

If the question is related to performance optimization or troubleshooting, please download the script and run it. Please select all and copy-paste the terminal output results for upload.

| username: select | Original post link

Only this table encountered such an issue. The solution at that time was to create a new table identical to this one, replace the old table, and backfill the historical data into the new table. The business then returned to normal.

| username: YuchongXU | Original post link

It is possible that changes in statistics have led to changes in the execution plan.

| username: select | Original post link

The statistics are correct, and the table health is also fine. However, none of the index queries for this table are working properly; they all result in full table scans. Previously, all queries that could use indexes are no longer doing so.

| username: db_user | Original post link

How is the situation with the check table? Is it due to issues with data or table metadata that caused the index to become invalid?

| username: select | Original post link

There is no check table. This is too resource-intensive, so I didn’t dare to operate it. How do you determine metadata issues?

| username: select | Original post link

The admin check table has no issues, the returned result is empty.

| username: h5n1 | Original post link

How was this confirmed?

| username: alfred | Original post link

Are all the indexes in a normal state?