After expanding a TiKV node, executing DDL is very slow

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv扩容一个节点后执行ddl很慢

| username: TiDBer_yBunUeUc

Problem description: After scaling out the cluster with one additional TiKV node, truncating an empty table is very slow. After that node is removed, DDL on the original three-node cluster is fast again. What could be causing this?

| username: tidb狂热爱好者 | Original post link

Take a look at the network. I see that your nodes span two subnets. TiKV works best on physical machines. If these are physical machines, the 99.1 and 88.1 addresses suggest the traffic crosses switches. It is best to keep all nodes under the same switch.

| username: ShawnYan | Original post link

Also, after this KV node was added, were all the regions rebalanced?

| username: TiDBer_yBunUeUc | Original post link

Automatic region balancing is enabled.

| username: 哈喽沃德 | Original post link

Is the performance of the new node poor?

| username: TiDBer_yBunUeUc | Original post link

The performance of the newly added node is the same as the previous three nodes.

| username: 路在何chu | Original post link

Was the DDL executed after region balancing had completed, or while the regions were still unbalanced?

| username: 路在何chu | Original post link

After a scale-out, it usually takes at least a day for the regions to finish balancing.

| username: TiDBer_yBunUeUc | Original post link

Running now

| username: 胡杨树旁 | Original post link

Check the TiDB logs corresponding to the DDL owner node to see if there are any hints.
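As a rough illustration of that log check: `ADMIN SHOW DDL` reports which TiDB instance is the DDL owner, and that instance's log can then be filtered for DDL-related warnings. The sketch below assumes the log uses TiDB's bracketed `[time] [LEVEL] [source] [message]` layout. The function name `ddl_log_hints` and the sample lines are made up for illustration and are not taken from this cluster.

```python
import re

# Matches the leading "[timestamp] [LEVEL]" fields of a bracketed log line.
LOG_LEVEL_RE = re.compile(r"^\[[^\]]*\]\s+\[(?P<level>[A-Z]+)\]")


def ddl_log_hints(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return lines that mention DDL and whose level is in `levels`."""
    hints = []
    for line in lines:
        match = LOG_LEVEL_RE.match(line)
        if match and match.group("level") in levels and "ddl" in line.lower():
            hints.append(line.rstrip("\n"))
    return hints


# Invented sample lines in the assumed layout (not real output from the thread).
sample = [
    '[2024/01/01 10:00:00.000 +08:00] [INFO] [ddl_worker.go:100] ["[ddl] run DDL job"]',
    '[2024/01/01 10:00:05.000 +08:00] [WARN] [ddl_worker.go:200] ["[ddl] wait schema version sync timeout"]',
    '[2024/01/01 10:00:06.000 +08:00] [WARN] [session.go:50] ["unrelated warning"]',
]

for hint in ddl_log_hints(sample):
    print(hint)
```

To run it against a real file, pass an open file object, for example `ddl_log_hints(open("tidb.log", encoding="utf-8"))`, where `tidb.log` is a placeholder path.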

| username: 像风一样的男子 | Original post link

Is the cluster slow overall? Check the latency panels on the Dashboard.

| username: TiDBer_yBunUeUc | Original post link

Which specific metrics should I look at? Please clarify, thanks.

| username: 像风一样的男子 | Original post link

Check CPU, memory, and latency on the Dashboard. Then look at the TiKV metrics in Grafana to see whether I/O latency is at the normal level of tens of milliseconds.


| username: TiDBer_yBunUeUc | Original post link

I checked the TiDB logs. They contain only INFO-level entries, with nothing obviously relevant.

| username: 像风一样的男子 | Original post link

Check whether regions are balanced across the TiKV nodes. Also verify that region scheduling in PD is working normally.

| username: 有猫万事足 | Original post link

You can check if there are any issues with NTP.

Also, what is the ping result from 63.120 to 52.67?
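If ICMP ping is blocked or unavailable, TCP connect time to the TiKV service port gives a rough equivalent. The sketch below is only an approximation of a ping test: it measures TCP handshake time, not ICMP round trips, and it does not check NTP. The host is left as a placeholder because the thread gives only partial addresses; 20160 is TiKV's default port and may differ in this deployment.

```python
import socket
import time


def tcp_connect_ms(host, port, attempts=5, timeout=2.0):
    """Return a list of TCP connect times in milliseconds, one per attempt."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # Open and immediately close a connection; only the handshake is timed.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return (min, average, max) of the samples in milliseconds."""
    return min(samples), sum(samples) / len(samples), max(samples)


# Example call, run from one TiKV host against another (placeholder address):
#   print(summarize(tcp_connect_ms("<new-tikv-host>", 20160)))
```

Comparing the result between two old nodes with the result between an old node and the new node shows whether the cross-subnet path is noticeably slower.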

| username: dba远航 | Original post link

Check the relevant performance metrics of this TiKV node.

| username: tidb菜鸟一只 | Original post link

It should be a network issue.

| username: andone | Original post link

Is the machine configuration consistent across nodes? Is the cluster still in the middle of the scale-out?

| username: tidb狂热爱好者 | Original post link

He has a problem with his network.