The cluster got stuck twice, resulting in two batches of slow queries

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 集群卡住两次,出现两批次慢查询

| username: magongyong

[TiDB Usage Environment] Production Environment
[TiDB Version] v5.4.3
[Encountered Problem]
Two batches of slow queries occurred today, one at 14:04:50 and one at 14:13; the 14:04 batch was the worse of the two. However, the actual business volume was not high, with QPS below 30,000. The relevant monitoring is as follows:
Dashboard Slow Query


QPS Monitoring

CPU, memory, IO, and other metrics look relatively normal, but the following metrics are abnormal. The official documentation does not explain what they mean, so I do not understand them well.
tikv-details - scheduler-key_mvcc interface

scheduler-txn_heart_beat

[Reproduction Path] Operations performed that led to the problem
Normal business operations

[Problem Phenomenon and Impact]
Many slow queries occurred, and the cluster got stuck twice. I suspect a network issue, but I would like to know how to confirm that from the monitoring.

[Attachments]


| username: xfworld | Original post link

  • Check if there are any anomalies in the status and number of regions.

  • There are quite a few MVCC keys; check whether there have been a large number of delete operations recently.

| username: magongyong | Original post link

I also communicated with the business side, and they said there were no large-scale deletion operations.

| username: xfworld | Original post link

There are empty regions here.

Empty regions are generally caused by a large number of deletions. :cowboy_hat_face:

| username: magongyong | Original post link

There have always been a lot of empty regions, more than 3,000 :joy:

| username: tidb狂热爱好者 | Original post link

We ran into this issue as well, and in our case it came down to the JDBC settings used with TiDB. You need to use version 8.0.30 of the JDBC driver and also enable its performance-optimized configuration.

| username: tidb狂热爱好者 | Original post link

Let me find the link for you. We struggled with this issue for a long time.

You need to configure this optimization parameter: useConfigs = maxPerformance
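
For readers who have not set this before, here is a minimal sketch of where the parameter goes: it is appended to the JDBC connection URL as a query-string property. The host `127.0.0.1`, the database name `test`, and the helper `buildUrl` are hypothetical placeholders for illustration; port 4000 is only TiDB's default and may differ in your deployment. The sketch only builds and prints the URL, it does not open a connection.

```java
class TidbJdbcUrl {
    // Builds a MySQL-protocol JDBC URL that carries useConfigs=maxPerformance.
    // host, port, and database are supplied by the caller (placeholders here).
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useConfigs=maxPerformance";
    }

    public static void main(String[] args) {
        // Hypothetical endpoint; replace with your own TiDB address.
        String url = buildUrl("127.0.0.1", 4000, "test");
        System.out.println(url);
    }
}
```

The resulting string would then be passed to `DriverManager.getConnection(url, user, password)` or to your connection pool's URL setting. Whether this resolves the slow queries described in this thread is not confirmed here; it is simply the setting suggested above.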

| username: tidb狂热爱好者 | Original post link

Here

| username: magongyong | Original post link

Thanks :+1:, I’ll study it.

| username: 近墨者zyl | Original post link

Learning :+1: :raising_hand_man:
