After upgrading to v7.1.1, latency has significantly increased

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 升级v7.1.1后延迟大大增加

| username: porpoiselxj

[TiDB Usage Environment] Production Environment
[TiDB Version] v7.1.1
[Reproduction Path] Upgraded from v6.1.1 to v7.1.1
[Encountered Problem: Phenomenon and Impact]
The 99.9th percentile latency shown in the dashboard monitoring overview panel rose from around 60 ms before the upgrade to around 400 ms after it. Please help investigate the issue.

In the wind-Performance-Overview panel, the KV Request Time By Source and two other metrics have significantly increased, while no major changes have been observed in other indicators. Details are as follows:

| username: 大飞哥online | Original post link

Is there any slow SQL at the corresponding time point?
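
One way to check this, as a rough sketch only: query the cluster-wide slow query table for the window around the upgrade. The two timestamps below are placeholders rather than values from this thread, and the 120-character truncation is an arbitrary choice.

```sql
-- Sketch: list the slowest user statements in a chosen window.
-- The timestamps are placeholders; replace them with the period to inspect.
SELECT time,
       instance,
       query_time,
       prewrite_time,
       commit_time,
       digest,
       LEFT(query, 120) AS query_head
FROM information_schema.cluster_slow_query
WHERE time BETWEEN '2023-01-01 00:00:00' AND '2023-01-01 01:00:00'
  AND is_internal = false          -- skip TiDB's own internal statements
ORDER BY query_time DESC
LIMIT 20;
```

Running the same query for a window before the upgrade (if the slow logs are still retained) gives a baseline to compare against.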

| username: porpoiselxj | Original post link

There are very few slow SQL queries, only a handful, and that has not changed much since the upgrade, so slow SQL is unlikely to be the cause. Looking at the monitoring, the KV Request Time By Source spiked right after the upgrade.

| username: 大飞哥online | Original post link

Please provide the information for the KV/TSO Request OPS metric.

| username: 大飞哥online | Original post link

The Execution Duration graph has risen, which indicates that SQL statements are spending more time in the execution phase.

| username: porpoiselxj | Original post link

After the upgrade, KV/TSO Request OPS actually decreased.

| username: 大飞哥online | Original post link

What are the top few metrics on the right side of the “KV Request Time By Source” that was posted above?

| username: 大飞哥online | Original post link

The graph shows it reaching 3 minutes, while the values in the legend on the right are in seconds. Sort the legend by value and look at the entries at the top.

| username: porpoiselxj | Original post link

  1. kv request total time
  2. Cop-external_Execute

| username: 大飞哥online | Original post link

Cop-external_Execute indicates that the Cop request originates from an internal analyze operation.

SHOW ANALYZE STATUS;
SELECT * FROM information_schema.analyze_status;
SELECT * FROM mysql.analyze_jobs;

Check the analyze information for that period.
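
To narrow this down to one period, a filtered version of the last query might look like the sketch below. The cutoff timestamp is a placeholder, not a value from this thread.

```sql
-- Sketch: analyze jobs that started after a chosen point in time.
-- The timestamp is a placeholder; use the time of the upgrade.
SELECT table_schema,
       table_name,
       job_info,
       state,
       start_time,
       end_time
FROM mysql.analyze_jobs
WHERE start_time >= '2023-01-01 00:00:00'
ORDER BY start_time;
```

If analyze were the cause, the job start and end times would be expected to line up with the peaks in KV Request Time By Source.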

| username: zhanggame1 | Original post link

Is the execution plan different? Try analyzing the table.
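
A possible way to look for plan changes, as a sketch: list statements that have been executed with more than one plan in the statement summary history. Note that this history only covers a limited retention window, so it may not reach back before the upgrade.

```sql
-- Sketch: statements seen with more than one execution plan.
SELECT digest,
       MIN(digest_text)            AS digest_text,
       COUNT(DISTINCT plan_digest) AS plan_count,
       SUM(exec_count)             AS total_exec_count
FROM information_schema.cluster_statements_summary_history
WHERE plan_digest != ''
GROUP BY digest
HAVING COUNT(DISTINCT plan_digest) > 1
ORDER BY total_exec_count DESC
LIMIT 20;
```

For any statement that shows up here, running ANALYZE TABLE on the tables it touches and then comparing the plan again is a reasonable next step.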

| username: porpoiselxj | Original post link

Based on your feedback, I checked the analyze tasks run since the upgrade and found no significant increase. Moreover, the KV Request Time By Source has stayed consistently high since the upgrade, even at times when no background analyze tasks were running.

| username: xfworld | Original post link

Keep observing and collect more information.

There are still some differences in data processing between major versions, so be cautious when upgrading in a production environment.

| username: 大飞哥online | Original post link

Is the largest contributor to KV Request Time By Source still Cop?

| username: 大飞哥online | Original post link

Keep monitoring it.

| username: 像风一样的男子 | Original post link

Why is the latency so high? Are there a lot of slow SQL queries?

| username: 大飞哥online | Original post link

The most time-consuming KV requests are Commit and Prewrite, and they originate from external Commit statements.

| username: 大飞哥online | Original post link

Everything shown on the page is an external source, meaning the requests originate outside of KV.

Check the monitoring information of TiDB during that time period to see if there are any anomalies.

| username: porpoiselxj | Original post link

Could you please specify which metrics to look at?