Possible Causes, Troubleshooting Directions, and Solutions for a Sudden Latency and I/O Increase in a TiDB Cluster

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version] v7
[Reproduction Path] The same job appears to have been writing data continuously
[Encountered Issues: Problem Symptoms and Impact] Queries are slow, responses are slow, and some queries time out
[Resource Configuration] TiDB Dashboard shows that latency, I/O, and related metrics have increased and are unstable

Have slow queries increased? What abnormal SQL statements show up in Top SQL and SQL statement analysis?
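
To see whether a few statements account for most of the slow-query time, the slow query records can be grouped by SQL digest. Below is a minimal Python sketch of that grouping. It assumes the rows come from `INFORMATION_SCHEMA.CLUSTER_SLOW_QUERY` through whatever MySQL-compatible client you already use; the helper name `top_digests`, the `%s` placeholders, and the choice of time window are illustrative, not something from this thread.

```python
from collections import defaultdict

# Assumed query: fetch user (non-internal) slow queries for the incident window.
# Bind the two %s placeholders to the start and end time of the latency spike.
SLOW_QUERY_SQL = """
SELECT Digest, Query_time
FROM INFORMATION_SCHEMA.CLUSTER_SLOW_QUERY
WHERE Time BETWEEN %s AND %s
  AND Is_internal = 0
"""


def top_digests(rows, n=5):
    """Rank SQL digests by total Query_time in seconds.

    rows is an iterable of (digest, query_time) pairs, for example the
    result set of SLOW_QUERY_SQL. Returns (digest, count, total_seconds)
    tuples, largest total first.
    """
    totals = defaultdict(lambda: [0, 0.0])  # digest -> [count, total seconds]
    for digest, query_time in rows:
        totals[digest][0] += 1
        totals[digest][1] += float(query_time)
    ranked = sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(digest, count, total) for digest, (count, total) in ranked[:n]]
```

If one or two digests dominate the list, those are the statements to compare against what Top SQL shows for the same period.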

Check the monitoring, CPU, disk I/O, etc., and see if there are more slow queries on the dashboard.

Check business sharding, hotspots, and slow SQL.
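
Since the symptom appeared while one job was writing continuously, a write hotspot is worth ruling out. The sketch below sums write traffic per table from the rows of `INFORMATION_SCHEMA.TIDB_HOT_REGIONS`. The helper name `write_hotspots_by_table` and the way rows are fetched are assumptions for illustration; the type comparison is done in Python so it does not depend on the exact case of the `TYPE` value.

```python
# Assumed query: list the current hot Regions with their owning table.
HOT_REGIONS_SQL = """
SELECT DB_NAME, TABLE_NAME, TYPE, FLOW_BYTES
FROM INFORMATION_SCHEMA.TIDB_HOT_REGIONS
"""


def write_hotspots_by_table(rows):
    """Sum FLOW_BYTES of write-hot Regions per (db, table).

    rows is an iterable of (db_name, table_name, type, flow_bytes) tuples,
    for example the result set of HOT_REGIONS_SQL. Returns a list of
    ((db, table), total_flow_bytes), largest first.
    """
    totals = {}
    for db, table, region_type, flow_bytes in rows:
        if str(region_type).lower() != "write":
            continue  # read hotspots are ignored in this sketch
        key = (db, table)
        totals[key] = totals.get(key, 0) + int(flow_bytes)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

If the table the job writes to sits at the top of this list, the hotspot direction deserves a closer look before tuning individual statements.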

Check for large SQL statements.

Check which SQL statements are the slow ones.

Look at Top SQL and the slow queries when narrowing down the issue.

First, take a look at the slow SQL, then check the IO usage.

Slow SQL

Did you check the disk IO bottleneck?
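
One way to check the disk directly on a TiKV host, in addition to the Dashboard graphs, is to sample `/proc/diskstats` twice and compute how busy each device was in between. This is a rough sketch that assumes a Linux host; the helper names `io_ticks` and `utilization` are made up for illustration. It uses the 13th whitespace-separated column of each line (milliseconds spent doing I/O), which is the counter that iostat's %util is derived from.

```python
def io_ticks(diskstats_text):
    """Map device name -> milliseconds spent doing I/O.

    Expects the text of /proc/diskstats; column 3 is the device name and
    column 13 is the cumulative time the device had I/O in flight.
    """
    ticks = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 13:
            ticks[fields[2]] = int(fields[12])
    return ticks


def utilization(before, after, interval_ms):
    """Percent of the interval each device was busy between two samples."""
    t0, t1 = io_ticks(before), io_ticks(after)
    return {
        dev: 100.0 * (t1[dev] - t0[dev]) / interval_ms
        for dev in t1
        if dev in t0
    }
```

To use it, read `/proc/diskstats`, wait a known interval (for example `time.sleep(1)`), read it again, and pass both texts with the interval in milliseconds. A data disk that stays close to 100% while the write job runs points toward an I/O bottleneck rather than a single bad statement.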

Has the issue been resolved? You can check TiDB’s dashboard for clues.

