Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tidb和tikv之间网络流量异常增加 (Abnormal increase in network traffic between TiDB and TiKV)
[TiDB Usage Environment] Test
[TiDB Version] 5.4.0
[Reproduction Path]
[Encountered Issue] Abnormal increase in network traffic between TiDB and TiKV. The TiDB error log is shown below.
Traffic in the test environment is normally very low, almost non-existent, but one night it suddenly spiked. Using iftop, I found that TiKV was sending traffic to TiDB, peaking at 700M and recurring every 5 minutes. I found no clues in the TiKV or PD logs; the TiDB log is shown in the image. Could someone help explain this? Thank you.
Take a look at the expensive query and slow query logs. This is probably caused by a join query.
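As a minimal sketch of the log check suggested above, the following Python filters a TiDB log for expensive-query entries inside a time window. It assumes each log line starts with a timestamp such as `[2022/03/01 02:05:00.123 +08:00]` and that expensive queries are tagged `[expensive_query]`; verify both against your own tidb.log. The function name `expensive_queries` is hypothetical.

```python
import re
from datetime import datetime

# Assumption: TiDB log lines begin with "[YYYY/MM/DD HH:MM:SS.mmm +ZZ:ZZ]".
TS_RE = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def expensive_queries(lines, start, end):
    """Return lines tagged [expensive_query] whose timestamp is within [start, end].

    Timestamps are compared as local log time; the zone offset is ignored.
    """
    hits = []
    for line in lines:
        # Assumption: expensive queries carry this tag in the log message.
        if "[expensive_query]" not in line:
            continue
        m = TS_RE.match(line)
        if not m:
            continue  # skip lines without a parsable timestamp
        ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        if start <= ts <= end:
            hits.append(line.rstrip("\n"))
    return hits
```

An open file object can be passed as `lines`, with `start` and `end` set to bracket one of the 5-minute spikes seen in iftop.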
The increase in network traffic is most likely due to a large amount of data being transferred between the computing layer (TiDB) and the storage layer (TiKV). Focus on which SQL statements were running on the cluster at that time.
Use the Dashboard to analyze query statements, and check the heatmap for read/write hotspots. Look at which databases, tables, and regions are being accessed to identify the workload responsible.
Judging from the log message, large DML statements seem to be the cause. Check the SQL that was executed at that point in time.
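One way to check the SQL at that point in time is to query the cluster-wide slow query table. The sketch below only builds a parameterized statement; it assumes the `INFORMATION_SCHEMA.CLUSTER_SLOW_QUERY` table and the listed columns are available in your version, and that you run it through a MySQL-compatible driver that uses `%s` placeholders. The helper name `slow_query_sql` is hypothetical, and ordering by `Total_keys` is just a rough proxy for how much data a statement touched.

```python
def slow_query_sql(start, end, limit=20):
    """Build (sql, params) listing slow queries in [start, end], largest scans first."""
    sql = (
        "SELECT Time, DB, Is_internal, Query_time, Total_keys, Process_keys, Query "
        "FROM INFORMATION_SCHEMA.CLUSTER_SLOW_QUERY "
        "WHERE Time BETWEEN %s AND %s "
        "ORDER BY Total_keys DESC "
        f"LIMIT {int(limit)}"  # int() keeps the limit from carrying arbitrary text
    )
    # Time bounds are passed as parameters rather than formatted into the string.
    return sql, (start, end)
```

Passing the returned pair to a cursor's `execute(sql, params)` should list the statements in that window, including internal ones (`Is_internal`).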
Is there a large number of query operations?
Has Grafana been deployed? If so, check the network monitoring panels on the tidb-cluster-node_exporter dashboard.
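If you export raw counter samples instead of reading the graphs, spikes can be located with a small script. This is a sketch under assumptions: the input is a list of `(unix_seconds, bytes_total)` pairs taken from a cumulative counter such as node_exporter's `node_network_transmit_bytes_total`, and the threshold is whatever you consider abnormal for the test environment. The function name `find_spikes` is hypothetical.

```python
def find_spikes(samples, threshold_bytes_per_sec):
    """Return (timestamp, rate) for each interval whose average rate meets the threshold.

    samples: list of (unix_seconds, cumulative_bytes), ascending by time.
    """
    spikes = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        if t1 <= t0 or c1 < c0:
            continue  # skip out-of-order samples and counter resets
        rate = (c1 - c0) / (t1 - t0)  # average bytes per second over the interval
        if rate >= threshold_bytes_per_sec:
            spikes.append((t1, rate))
    return spikes
```

The gaps between the returned timestamps show whether the spikes really recur every 5 minutes, which helps match them against SQL that runs on a schedule.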
This is generally caused by TiDB executing large SQL statements.
Use the Top SQL feature to locate the issue. It is generally caused by SQL.
Is it another case of a large transaction or a query returning a large result set?