High CPU Load in TiDB

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tidb cpu负载高

| username: Mirror

[TiDB Usage Environment] Production Environment
[TiDB Version] 7.5.0
[Encountered Issue: Problem Phenomenon and Impact]: We are migrating from MySQL to TiDB, and this project will eventually phase out MySQL. However, the data synchronized to TiDB lags persistently and never catches up. The dashboard and monitoring show occasional inserts taking more than 10 seconds. CPU load is high, while other loads, such as I/O, are around 60%.
[Attachments: Screenshots/Logs/Monitoring]

| username: TiDBer_jYQINSnf | Original post link

Look at the KV request metrics and see whether any are particularly high. Then check the TiKV monitoring specifically to see whether the CPU is maxed out.

| username: 小龙虾爱大龙虾 | Original post link

Find a few slow SQL queries to check.
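One way to pull the slowest statements is to query TiDB's slow query system table. The sketch below only builds the SQL text and its parameters; the function name, the default threshold, and the default limit are illustrative choices, not values from this thread.

```python
def slow_query_sql(min_seconds=1.0, limit=10):
    """Build a query for the slowest user statements across all TiDB nodes.

    Returns (sql, params) using the %s placeholder style that MySQL
    drivers such as PyMySQL accept. The threshold and limit defaults
    are assumptions chosen for illustration.
    """
    sql = (
        "SELECT time, query_time, parse_time, compile_time, digest, query "
        "FROM information_schema.cluster_slow_query "
        "WHERE is_internal = false AND query_time >= %s "
        "ORDER BY query_time DESC "
        "LIMIT %s"
    )
    return sql, (float(min_seconds), int(limit))


if __name__ == "__main__":
    sql, params = slow_query_sql(min_seconds=10, limit=5)
    print(sql)
    print(params)
```

The returned pair can be passed to `cursor.execute(sql, params)` on a connection to a TiDB server. Comparing `parse_time` and `compile_time` against `query_time` shows whether the time is spent in TiDB itself or further down in TiKV.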

| username: kelvin | Original post link

You need to test the slow SQL, right?

| username: danghuagood | Original post link

Check to see if the CPU allocation for TiKV is too small.

| username: 有猫万事足 | Original post link

Could it be a hotspot issue? Are all the writes concentrated on one region?

| username: tidb菜鸟一只 | Original post link

It is most likely a hotspot issue. Check if the corresponding slow table is using the auto-increment primary key from MySQL. The table structure might not have been optimized when migrated to TiDB.
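To illustrate why a MySQL-style auto-increment primary key can create a write hotspot, here is a simplified model. It assumes the key space is cut into equal slices standing in for Regions, and that a scattered ID carries a few random shard bits just below the sign bit. Real Regions split by data size, and the number of shard bits here is an assumption, so treat this as a sketch of the idea rather than TiDB's implementation.

```python
import random

SHARD_BITS = 5               # assumed number of shard bits, for illustration only
SEQ_BITS = 63 - SHARD_BITS   # remaining bits below the sign bit hold the sequence


def sequential_ids(n, start=1):
    """IDs as an auto-increment column would hand them out."""
    return [start + i for i in range(n)]


def scattered_ids(n, start=1, seed=0):
    """IDs with a random shard prefix in the high bits (simplified model)."""
    rng = random.Random(seed)
    return [
        (rng.randrange(1 << SHARD_BITS) << SEQ_BITS) | (start + i)
        for i in range(n)
    ]


def writes_per_range(ids, num_ranges=8):
    """Count how many IDs land in each of num_ranges equal slices of [0, 2**63)."""
    width = (1 << 63) // num_ranges
    counts = [0] * num_ranges
    for key in ids:
        counts[min(key // width, num_ranges - 1)] += 1
    return counts


if __name__ == "__main__":
    print("sequential:", writes_per_range(sequential_ids(10_000)))
    print("scattered: ", writes_per_range(scattered_ids(10_000)))
```

In this model every sequential ID falls into the same slice, while the scattered IDs spread over all slices. In TiDB, the table options usually discussed for this purpose are `AUTO_RANDOM` on the primary key or `SHARD_ROW_ID_BITS` for tables without a clustered integer key.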

| username: WinterLiu | Original post link

The earlier replies have analyzed it well. It is either a data hotspot issue or the resources are genuinely insufficient.

| username: changpeng75 | Original post link

The load on the 3 TiKV nodes is basically consistent, so it's unlikely to be a hotspot issue, right? The machines may simply be underpowered, and the two TiDB servers also consume significant resources.
First, try separating the TiDB servers from TiKV: find three other machines and migrate TiDB and PD there.
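The "is the load even across TiKV nodes?" check can be turned into a number. The sketch below computes the ratio of the busiest store to the average. The store names, CPU figures, and the 1.5 threshold are all hypothetical and not taken from this cluster.

```python
from statistics import mean


def skew_ratio(cpu_by_store):
    """Return max/mean CPU usage across stores; 1.0 means perfectly even."""
    values = list(cpu_by_store.values())
    avg = mean(values)
    if avg == 0:
        return 1.0
    return max(values) / avg


def looks_like_hotspot(cpu_by_store, threshold=1.5):
    """Flag a hotspot when one store is well above the average.

    The threshold is an arbitrary illustrative cut-off, not a TiDB default.
    """
    return skew_ratio(cpu_by_store) >= threshold


if __name__ == "__main__":
    balanced = {"tikv-1": 78, "tikv-2": 81, "tikv-3": 80}   # hypothetical values
    skewed = {"tikv-1": 95, "tikv-2": 20, "tikv-3": 25}     # hypothetical values
    print(looks_like_hotspot(balanced), looks_like_hotspot(skewed))
```

A ratio close to 1.0 with all stores busy points toward insufficient resources rather than a hotspot, which matches the reasoning in the post above.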

| username: Soysauce520 | Original post link

Parsing in TiDB takes 76 ms. What is the machine configuration?

| username: yulei7633 | Original post link

Check whether statistics collection was consuming a lot of resources at that time.

| username: redgame | Original post link

Look at Top SQL.

| username: Mirror | Original post link

For a 36-core machine, what is a reasonable normal value for this metric?

| username: 考试没答案 | Original post link

First, work out which workload is currently busy, and which tables and data volumes are involved. Without a significant workload, the CPU would not be that high.

| username: 考试没答案 | Original post link

There is no such thing as a reasonable or unreasonable value for this metric. Sometimes when our workload is busy, the CPU is at 100%. We leave it alone because resources are limited.

| username: cassblanca | Original post link

This looks like a mixed deployment without NUMA binding configured. Check the Region distribution to see whether there are any hotspots.

| username: Soysauce520 | Original post link

The resources should be sufficient and are not yet at a bottleneck. How many rows are inserted per statement? Since the SELECT statements complete in milliseconds, the INSERT statements may carry too many values.
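If oversized multi-row inserts are the suspect, one option is to cap the number of rows per statement. The sketch below splits rows into fixed-size batches and builds a parameterized multi-row INSERT for each. The table name, column names, and batch size are hypothetical, and the right batch size has to be found by testing.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_insert(table, columns, batch):
    """Build one multi-row INSERT with %s placeholders and a flat parameter list.

    `table` and `columns` are interpolated as-is, so they must come from
    trusted code, never from user input.
    """
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([row_placeholder] * len(batch))
    )
    params = [value for row in batch for value in row]
    return sql, params


if __name__ == "__main__":
    rows = [(i, "name-{}".format(i)) for i in range(5)]   # hypothetical data
    for batch in chunked(rows, 2):
        print(build_insert("orders", ["id", "name"], batch))
```

Each `(sql, params)` pair can be passed to `cursor.execute` of a MySQL driver. Smaller batches keep each statement cheap to parse and commit, at the cost of more round trips.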

| username: TIDB-Learner | Original post link

Is TiDB currently serving business traffic? If it is only the target database for the data migration, and memory is normal but CPU is high, the problem very likely lies with the TiDB tables. Possible causes: 1. Unreasonable configuration. 2. Index issues. 3. High statement complexity. 4. Frequent disk I/O operations.

| username: TiDBer_5Vo9nD1u | Original post link

Check the execution plan of the slow SQL and optimize it.

| username: 小于同学 | Original post link

Find a few slow SQL queries to check.