Installed TiDB in a test environment: high latency and many requests time out

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 测试环境安装了TIDB,发现延迟很高,很多请求都会超时

| username: TiDBer_8mpI7QNz

[TiDB Usage Environment] Production Environment
[TiDB Version]
[Encountered Problem: Network timeouts occur when multiple users call the interface at the same time. The slow query log shows a send timeout, but the same SQL is not slow when run on its own.]
[Resource Configuration] Enter TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
[Attachments: Screenshots/Logs/Monitoring]


Image

Image

After several consecutive requests to the interface, the monitoring for one node on Alibaba Cloud looks like this. How can this be optimized? Is 500M/S on the intranet normal?

Image

Image

| username: 托马斯滑板鞋 | Original post link

Here is the result of the trace (SQL statement). It looks like the CPU is overloaded.

| username: tidb菜鸟一只 | Original post link

When using multiple machines on Alibaba Cloud, it is best to put them under the same switch and install TiDB using their internal IPs.

| username: TiDBer_8mpI7QNz | Original post link

They are all on the intranet.

| username: DBAER | Original post link

Normally, a VPC shouldn’t have any network issues, right?

| username: TiDBer_8mpI7QNz | Original post link

A single SQL query responds very quickly, but once the application in the test environment runs against TiDB, requests hit network timeouts.

| username: TiDBer_8mpI7QNz | Original post link

Boss, how do you check if the network has any issues?

| username: 托马斯滑板鞋 | Original post link

What do you mean? Two different environments? Or is it that the application in the test environment can’t connect?

| username: 托马斯滑板鞋 | Original post link

Could it be that the application concurrency is set too high? Check Top SQL on the TiDB Dashboard to see whether many statements are running.

| username: TiDBer_8mpI7QNz | Original post link

We are using production data to simulate load in the test environment. With MySQL, access works normally, but after switching to TiDB even normal access fails and everything times out.

| username: 托马斯滑板鞋 | Original post link

Can that single SQL statement be executed in TiDB?

| username: TiDBer_8mpI7QNz | Original post link

This is a single-user page test with no concurrency, and with MySQL it works normally.

| username: 托马斯滑板鞋 | Original post link

Please run TRACE on that SQL statement and post the output.
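
In TiDB, this means prefixing the statement with `TRACE`. Below is a minimal Python sketch of how that could be wrapped; the helper names `build_trace_statement` and `run_trace`, and the `orders` table in the example, are hypothetical. It assumes you already have a DB-API cursor from whichever MySQL-compatible driver you use, and it only accepts the `row` and `json` formats.

```python
def build_trace_statement(sql: str, fmt: str = "row") -> str:
    """Wrap a SQL statement in TiDB's TRACE syntax (hypothetical helper)."""
    if fmt not in ("row", "json"):
        raise ValueError(f"unsupported trace format in this sketch: {fmt!r}")
    # Drop surrounding whitespace and a trailing semicolon before wrapping.
    statement = sql.strip().rstrip(";").strip()
    return f"TRACE FORMAT='{fmt}' {statement}"


def run_trace(cursor, sql: str, fmt: str = "row"):
    """Execute the TRACE statement on an existing DB-API cursor and return all rows."""
    cursor.execute(build_trace_statement(sql, fmt))
    return cursor.fetchall()


# Example with a hypothetical table name:
print(build_trace_statement("SELECT * FROM orders WHERE user_id = 42;"))
# prints: TRACE FORMAT='row' SELECT * FROM orders WHERE user_id = 42
```

The returned rows list the traced operations with their durations, which is what helps show where the time is going.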

| username: TiDBer_8mpI7QNz | Original post link

Single requests and responses are both very fast.

| username: 托马斯滑板鞋 | Original post link

In TiDB, a single SQL query is fine, but it doesn’t work well under high concurrency?
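
One way to confirm that pattern is to time the same call at increasing concurrency levels. The following is only a rough sketch: `median_latency` and `fake_query` are hypothetical names, and `fake_query` is a placeholder that you would replace with a function that runs the real query and reads its whole result set.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def time_call(fn) -> float:
    """Return the wall-clock seconds taken by one call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def median_latency(fn, concurrency: int, requests_per_worker: int = 5) -> float:
    """Call fn from `concurrency` threads and return the median latency in seconds."""
    def worker(_):
        return [time_call(fn) for _ in range(requests_per_worker)]

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        batches = list(pool.map(worker, range(concurrency)))
    return median(t for batch in batches for t in batch)


def fake_query():
    # Placeholder for one real query plus fetching its full result set.
    time.sleep(0.01)


if __name__ == "__main__":
    for level in (1, 8, 32):
        print(f"concurrency={level:>2} median={median_latency(fake_query, level):.4f}s")
```

If the median stays flat at concurrency 1 but climbs steeply at higher levels, the bottleneck is a shared resource (CPU, bandwidth, connections) rather than the statement itself.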

| username: TiDBer_8mpI7QNz | Original post link

The current situation is this: as soon as there are many requests, they time out immediately. Surely the 3 servers we bought for 50,000 to deploy TiDB should not be slower than MySQL, right?

| username: 托马斯滑板鞋 | Original post link

Looking at your second screenshot, I suspect that your host’s public network bandwidth is too small (the time is spent returning the result set to the client). You can try adding LIMIT 1 to the query and then stress test it again.

| username: TiDBer_8mpI7QNz | Original post link

All connections to TiDB are on the internal network, so there are no issues with external network bandwidth.

| username: 托马斯滑板鞋 | Original post link

So 500 Mb/s is the limit of the internal network bandwidth. You can open a ticket with Alibaba Cloud to ask about the internal bandwidth limit of your instances. Additionally, you can modify the application to add LIMIT 1 to the SQL query, which shrinks the result set, and then run the stress test again to see if it helps.
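
A back-of-the-envelope sketch of why a bandwidth cap would produce this pattern. The result-set sizes and the concurrency level below are made-up numbers for illustration, not measurements from this cluster, and `transfer_seconds` is a hypothetical helper.

```python
def transfer_seconds(result_bytes: int, concurrent_requests: int,
                     link_megabits_per_s: float) -> float:
    """Lower bound on the time to push every result set through one shared link."""
    link_bytes_per_s = link_megabits_per_s * 1_000_000 / 8  # megabits/s -> bytes/s
    return result_bytes * concurrent_requests / link_bytes_per_s


# Assumed: a 5 MB result set per request, 50 requests in flight, a 500 Mb/s link.
full = transfer_seconds(5_000_000, 50, 500)
# Assumed: the same query with LIMIT 1 returns roughly 200 bytes.
limited = transfer_seconds(200, 50, 500)

print(f"full result sets: {full:.2f} s")
print(f"with LIMIT 1:     {limited:.5f} s")
```

With these assumed numbers, the full result sets need at least 4 seconds of pure transfer time on a saturated link, while the LIMIT 1 version is negligible. If the stress test stops timing out once LIMIT 1 is added, that points to result-set size and bandwidth rather than query execution.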

| username: DBAER | Original post link

According to the screenshot, the most time-consuming step is sending the result to the client.