The CPU usage of several TiKV machines fluctuates

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv 几台机器的cpu忽高忽低

| username: cy6301567

Our TiDB cluster has three TiKV instances deployed. Monitoring shows that the CPU usage of one TiKV machine is sometimes much higher than that of the other two. I don’t know what is causing this.

| username: h5n1 | Original post link

There is probably a read hotspot. Which version are you using? You can find the SQL statements with the highest CPU consumption on a specific TiKV directly on the Dashboard’s Top SQL page.

| username: TIDB-Learner | Original post link

  1. Are the three machines configured the same?
  2. Is there a hotspot issue?
  3. Is the region distribution uneven?
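
One rough way to answer the third question is to compare each store's Region (or leader) count with the cluster average. The sketch below is a hypothetical helper, not part of any TiDB tool; the store names and counts are made-up inputs that you would replace with figures read from your own monitoring.

```python
from statistics import mean


def count_skew(counts):
    """Return each store's count divided by the average across stores.

    `counts` maps a store name to its Region (or leader) count.
    A ratio well above 1.0 means that store holds more than its share.
    """
    if not counts:
        return {}
    avg = mean(counts.values())
    if avg == 0:
        return {store: 0.0 for store in counts}
    return {store: round(n / avg, 2) for store, n in counts.items()}


# Hypothetical numbers for three TiKV stores (assumed, not real data).
example = {"tikv-1": 1200, "tikv-2": 1200, "tikv-3": 2400}
skew = count_skew(example)
```

A store whose ratio stays close to 1.0 while its CPU is still high suggests the problem is hot data rather than uneven Region placement.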

| username: TiDBer_jYQINSnf | Original post link

Check the thread CPU to see which thread pool is high.

| username: 舞动梦灵 | Original post link

First, check Top SQL in the monitoring for any abnormally large transaction SQL. Then check the processlist to see if there are any long-running SQL queries. Most issues like this turn out to be caused by a read hotspot.

| username: tidb菜鸟一只 | Original post link

Directly check the hotspots on the dashboard; the hotspots should be concentrated on this TiKV.

| username: DBAER | Original post link

Check the region distribution in the monitoring and the traffic analysis in the Dashboard; it looks like a hotspot.

| username: miya | Original post link

Yes, I also suggest checking if there are any large data operations being performed.

| username: terry0219 | Original post link

You can check whether it is a read hotspot or a write hotspot. If it’s a read hotspot, generally you need to optimize the SQL. If it’s a write hotspot, see if you can scatter the hotspot.
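
To make the read/write distinction concrete, the sketch below totals traffic per hotspot type. It assumes you have already collected rows of (type, flow bytes), for example from the `TYPE` and `FLOW_BYTES` columns of `information_schema.TIDB_HOT_REGIONS`; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def dominant_hotspot(rows):
    """Sum flow bytes per hotspot type and return (type, totals).

    `rows` is an iterable of (hotspot_type, flow_bytes) pairs, where
    hotspot_type is "read" or "write" (case-insensitive).
    Returns (None, {}) when there are no rows.
    """
    totals = defaultdict(int)
    for hotspot_type, flow_bytes in rows:
        totals[hotspot_type.lower()] += flow_bytes
    if not totals:
        return None, {}
    return max(totals, key=totals.get), dict(totals)


# Made-up sample rows, not real cluster output.
sample = [("read", 52_000_000), ("write", 3_000_000), ("READ", 18_000_000)]
kind, totals = dominant_hotspot(sample)
```

If `kind` comes back as `"read"`, the advice above points toward optimizing the SQL; if it is `"write"`, toward scattering the hotspot.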

| username: Hacker_PtIIxHC1 | Original post link

Check whether there are any large SQL queries, then look at the region distribution of the tables those queries touch to see if there are read hotspots. TiDB Dashboard also has a heatmap that you can check.

| username: dba远航 | Original post link

Check what was running at that time; it’s likely a large SQL query.

| username: 小于同学 | Original post link

Check the thread CPU to see which thread pool is high.

| username: zhang_2023 | Original post link

Check whether the hotspot regions are evenly distributed, and look at the execution time and efficiency of large transactions.

| username: zhaokede | Original post link

There are either hotspots or slow SQL queries.

| username: 小龙虾爱大龙虾 | Original post link

In this situation, the best tool to use is the Dashboard’s Top SQL.

| username: cy6301567 | Original post link

Yes. We are on version 7.5, and we found a SQL query that frequently consumes a lot of CPU. The index used when the SQL executes is not optimal. Sometimes it filters data based on shop ID, and sometimes based on order ID. Our query conditions are shopId, tid, and oid. The table’s health is only 61.

| username: cy6301567 | Original post link

Okay, thank you. The three TiKV machines have the same configuration, and the data IDs are auto-incremented.

| username: cy6301567 | Original post link

Okay, thank you.

| username: cy6301567 | Original post link

Yes, monitoring shows that there is a SQL query being executed very frequently.

| username: 有猫万事足 | Original post link

Does the query use GROUP BY?
Or is it just a simple multi-table join?

If it involves GROUP BY, consider using TiFlash + MPP. For simple multi-table joins, consider binding the execution plan.