CPU Surge

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: CPU 暴涨

| username: TiDBer_NxOGHZx6

[TiDB Usage Environment] Production Environment / Testing / PoC
[TiDB Version]
[Reproduction Path] What operations were performed when the issue occurred
[Encountered Issue] I initially suspected slow queries: one slow query dragged down the entire cluster, and subsequent queries that are not normally slow also became slow. How can I find the statement that actually caused the cluster issue?
[Resource Configuration] Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachment: Screenshot/Logs/Monitoring]

| username: zhanggame1 | Original post link

First, check the slow query log and find the slowest queries executed at the time when the issue occurred.
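As a rough sketch of that first step, the helper below builds a query against TiDB's `INFORMATION_SCHEMA.CLUSTER_SLOW_QUERY` table for a given incident window, ordered by `Query_time`. The function name, the time window, and the default limit are illustrative assumptions, not something from this thread; run the resulting SQL with whatever MySQL-compatible client you normally use.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_slow_query_sql(start: str, end: str, limit: int = 20) -> str:
    """Build SQL that lists the slowest statements in [start, end].

    Hypothetical helper: the name and defaults are assumptions for this sketch.
    Timestamps are parsed and re-formatted so only well-formed values
    end up in the SQL text.
    """
    start_ts = datetime.strptime(start, TIME_FORMAT)
    end_ts = datetime.strptime(end, TIME_FORMAT)
    if end_ts <= start_ts:
        raise ValueError("end must be later than start")
    return (
        "SELECT Time, Query_time, Digest, Query "
        "FROM INFORMATION_SCHEMA.CLUSTER_SLOW_QUERY "
        f"WHERE Time BETWEEN '{start_ts.strftime(TIME_FORMAT)}' "
        f"AND '{end_ts.strftime(TIME_FORMAT)}' "
        "ORDER BY Query_time DESC "
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Example window; replace it with the period when the CPU surged.
    print(build_slow_query_sql("2024-01-01 10:00:00", "2024-01-01 10:15:00"))
```

Keep the window narrow, around the moment the CPU started to climb, so that statements that only became slow later as a side effect are less likely to dominate the list.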

| username: 小龙虾爱大龙虾 | Original post link

Execution time is an important metric, but when the system as a whole is struggling, even SQL that normally runs fine takes longer. Key metrics such as Total_keys, Process_keys, and Num_cop_tasks, on the other hand, stay relatively stable and are not significantly affected by overall system problems. Focusing on them makes it easier to pinpoint the SQL statements that are actually causing the slowdown. In slow query log analysis, these stable metrics are therefore just as valuable as execution time.
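To illustrate the idea, here is a minimal sketch that groups slow-query records by statement digest and ranks them by total `Process_keys` instead of by execution time. It assumes the records have already been fetched (for example from the slow query table) into dictionaries with lowercase keys; the function name and the sample rows are made up for illustration.

```python
from collections import defaultdict


def rank_by_process_keys(rows):
    """Aggregate slow-query rows per digest and sort by keys processed.

    Each row is assumed to be a dict with the keys: digest, query_time,
    total_keys, process_keys, num_cop_tasks (hypothetical layout).
    """
    totals = defaultdict(
        lambda: {"process_keys": 0, "total_keys": 0,
                 "num_cop_tasks": 0, "executions": 0, "max_query_time": 0.0}
    )
    for row in rows:
        agg = totals[row["digest"]]
        agg["process_keys"] += row["process_keys"]
        agg["total_keys"] += row["total_keys"]
        agg["num_cop_tasks"] += row["num_cop_tasks"]
        agg["executions"] += 1
        agg["max_query_time"] = max(agg["max_query_time"], row["query_time"])
    # Heaviest scanners first, regardless of how long each statement took.
    return sorted(totals.items(),
                  key=lambda item: item[1]["process_keys"], reverse=True)


if __name__ == "__main__":
    # Made-up sample: the point lookup took longest only because the
    # cluster was already overloaded by the large scan.
    sample = [
        {"digest": "point_lookup", "query_time": 12.0,
         "total_keys": 10, "process_keys": 10, "num_cop_tasks": 1},
        {"digest": "full_scan", "query_time": 9.0,
         "total_keys": 50_000_000, "process_keys": 50_000_000,
         "num_cop_tasks": 400},
    ]
    for digest, agg in rank_by_process_keys(sample):
        print(digest, agg)
```

In the sample, sorting by execution time would put the point lookup first, while sorting by keys processed surfaces the full scan, which is the more plausible source of the load.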

| username: dba远航 | Original post link

When you look through the slow SQL, the statement with the longest runtime usually turns out to be the culprit.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.