How to Handle High CPU Usage Displayed in TiDB 4.0 Monitoring

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TIDB4.0 监控显示CPU Usage占用高如何处理

| username: liuhf-yx

[TiDB Usage Environment] Production Environment
[TiDB Version] v4.03
[Reproduction Path] CPU usage is normal when the application is not running. After the application starts, CPU usage climbs gradually, reaching about 80% after roughly half a month. It returns to normal once the TiDB service is restarted.
Why is the CPU usage so high on the TiDB panel?

| username: DBRE | Original post link

Is the high CPU usage on tidb-server or on tikv-server?
Please provide the tidb-server QPS data and the slow SQL data.

| username: liuhf-yx | Original post link

The CPU usage is high on tidb-server.


| username: 裤衩儿飞上天 | Original post link

Capture the slow SQL statements.

| username: DBRE | Original post link

The monitoring shows both slow SQL and elevated Duration. First, use the Dashboard to analyze and optimize the slow SQL.

| username: liuhf-yx | Original post link

I checked today's slow SQL, and all of it consists of ad hoc statements that were run today to handle issues.

| username: TiDBer_pkQ5q1l0 | Original post link

Start by analyzing one of the slow SQL statements.

| username: liuhf-yx | Original post link

Today's slow SQL statements were one-off statements run today, not part of the daily workload, yet the CPU usage keeps increasing.

| username: 裤衩儿飞上天 | Original post link

While the CPU usage is climbing, run pt-query-digest on the slow logs with its default sort order. Then check the top-ranked statements and analyze those first.
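To illustrate the kind of ranking this produces, here is a minimal Python sketch, not pt-query-digest itself. It groups slow-log entries by a normalized statement and sorts them by total query time. It assumes each entry carries a `# Query_time: <seconds>` header line followed by a single-line SQL statement; the `fingerprint` and `digest` helpers and the sample entries are made up for illustration.

```python
import re
from collections import defaultdict


def fingerprint(sql: str) -> str:
    """Replace literals with '?' so statements that differ only in values group together."""
    s = sql.strip().lower()
    s = re.sub(r"'(?:[^'\\]|\\.)*'", "?", s)   # quoted string literals
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "?", s)   # numeric literals
    s = re.sub(r"\s+", " ", s)                 # collapse whitespace
    return s.rstrip(";")


def digest(log_text: str):
    """Return (fingerprint, count, total_seconds) rows, largest total time first."""
    totals = defaultdict(lambda: [0, 0.0])  # fingerprint -> [count, total seconds]
    query_time = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("# Query_time:"):
            query_time = float(line.split(":", 1)[1])
        elif line and not line.startswith("#") and query_time is not None:
            # Assumption: the first non-comment line after the header is the statement.
            entry = totals[fingerprint(line)]
            entry[0] += 1
            entry[1] += query_time
            query_time = None
    rows = [(fp, count, total) for fp, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Made-up sample entries, only to show the shape of the input.
    sample = """\
# Query_time: 1.5
select * from orders where id = 42;
# Query_time: 2.0
select * from orders where id = 7;
# Query_time: 3.0
select name from users where email = 'a@example.com';
"""
    for fp, count, total in digest(sample):
        print(f"{total:8.3f}s  x{count}  {fp}")
```

Grouping by fingerprint matters here because a statement that is only moderately slow but runs very often can account for more total time than a single very slow one.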

| username: liuhf-yx | Original post link

Okay, I'll use pt-query-digest to find the SQL to optimize first, then observe the results.

| username: liuhf-yx | Original post link

All slow queries have been optimized, but the CPU usage has hardly changed and is still around 86%.

| username: Jiawei | Original post link

Does your application connect only to this one tidb-server, or is there a proxy or load balancer in front of it?

| username: liuhf-yx | Original post link

The main business application connects to this one TiDB server, and day-to-day ad hoc usage connects to another server. There is no load balancing.

| username: Jiawei | Original post link

Several approaches:

  1. Optimize the slow SQL queries, focusing on the key ones.
  2. Add a proxy in front of the TiDB servers to help distribute the load.

| username: liuhf-yx | Original post link

After I kept optimizing until all slow queries ran in under 600 ms, the CPU usage dropped once the TiDB service was restarted. The CPU usage is currently below 5%, which should be the result of optimizing the slow SQL queries.
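A check like this could be scripted. The following Python sketch lists slow-log entries that still exceed the 600 ms target. It assumes the same layout as before, a `# Query_time: <seconds>` header followed by a single-line statement; the `over_threshold` helper is hypothetical and not part of any TiDB tooling.

```python
import re

# Assumed header format: "# Query_time: <seconds>"
QUERY_TIME = re.compile(r"^# Query_time:\s*([0-9.]+)")


def over_threshold(log_text: str, threshold_s: float = 0.6):
    """Return (seconds, statement) pairs whose Query_time exceeds threshold_s."""
    hits = []
    seconds = None
    for line in log_text.splitlines():
        line = line.strip()
        match = QUERY_TIME.match(line)
        if match:
            seconds = float(match.group(1))
        elif line and not line.startswith("#") and seconds is not None:
            if seconds > threshold_s:
                hits.append((seconds, line))
            seconds = None
    return hits
```

An empty result for a recent slow log would suggest that every recorded statement is within the target.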
