Error Occurred When Changing TiDB Version During Deployment

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 部署tidb更换版本时出现报错 (Error reported when changing TiDB version during deployment)

| username: TiDBer_ZbEmSPmr

[Test Environment for TiDB] Testing TiDB, single machine simulating cluster deployment
[TiDB Version] v7.1.0
[Reproduction Path] Purchased a CentOS 7 ECS instance and installed according to the official documentation: TiDB 数据库快速上手指南 | PingCAP 文档中心 (TiDB Quick Start Guide)
TiFlash failed to install: the error reported that port 9000 was occupied. The configuration file shows the default port as 3930, and netstat -tln | grep 3930 did not show it occupied. I then ran tiup cluster clean to delete the cluster, hoping to try another version, but the deletion also failed, and the log file at the detailed log path indicated by the error message was empty. The ECS security group already allows ports 1-65535 from the local machine.
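The port check described above can be sketched as a small helper (an illustrative function, not part of TiUP; assumes Linux with ss from iproute2 available):

```shell
# check_port: report whether anything is listening on a given TCP port.
# Hypothetical helper for illustration; uses ss (iproute2) rather than netstat.
check_port() {
  if ss -tln 2>/dev/null | grep -Eq "[:.]$1([[:space:]]|\$)"; then
    echo "port $1 in use"
  else
    echo "port $1 free"
  fi
}

check_port 9000   # the port the TiFlash error complained about
check_port 3930   # flash_service_port from the configuration file
```

Note that 9000 is TiFlash's default tcp_port, which is a separate setting from flash_service_port (default 3930), so both are worth checking.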
[Encountered Issues: Phenomenon and Impact]

  1. Failed to install TiFlash on version 7.1.0.

  2. Failed to clean the cluster installed with version 7.1.0.

  3. When deploying a v5.4.1 cluster on the same machine, it again reported a port as occupied, possibly as a result of the failure in step 2, but checking the reported port also showed it free. Impact: unable to complete the installation and deployment of TiDB.
    [Resource Configuration]
    CentOS 7.9, 64 cores / 128 GB RAM, 1 TB SSD, 100 Mbps
    [Attachments: Screenshots/Logs/Monitoring]

  4. The error is similar to these threads, but checking the port shows it is indeed not occupied: centos8 tidb v6.6.0 集群安装后 tiflash 无法正常启动 - #4, from 裤衩儿飞上天 - TiDB forum; 启动集群tiflash启动失败 - TiDB forum

  5. Error: failed to stop grafana: failed to stop: grafana-3000.service, please check the instance's log(/tidb-deploy/grafana-3000/log) for more detail.: executor.ssh.execute_failed: Failed to execute command over SSH for '' {ssh_stderr: , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin /usr/bin/sudo -H bash -c "systemctl daemon-reload && systemctl stop grafana-3000.service"}, cause: dial tcp i/o timeout.

  6. tiup cluster deploy tidb-test-cluster1 v5.4.1 /root/tidb/topo.yaml --user root -p

tiup is checking updates for component cluster …

Starting component cluster: /root/.tiup/components/cluster/v1.12.3/tiup-cluster deploy tidb-test-cluster1 v5.4.1 /root/tidb/topo.yaml --user root -p

Error: Deploy port conflicts to an existing cluster (spec.deploy.port_conflict)

The port you specified in the topology file is:

Port: 2379

Component: pd

It conflicts to a port in the existing cluster:

Existing Cluster Name: tidb-test-cluster

Existing Port: 2379

Existing Component: pd

Please change to use another port or another host.

| username: tidb菜鸟一只 | Original post link

How much memory do you have? Is there any spare memory left after the cluster is up, before TiFlash is deployed?

| username: TiDBer_ZbEmSPmr | Original post link

128 GB, all allocated to TiDB.

| username: tidb菜鸟一只 | Original post link

Use tiup cluster list to check whether any clusters are left over. If so, remove them directly with destroy instead of clean:
tiup cluster destroy <cluster-name>
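Wrapped as a guarded sketch (maybe_destroy is a hypothetical helper, not a TiUP command; it assumes tiup is on PATH on a real control machine, and if tiup or the named cluster is absent it simply reports nothing to destroy):

```shell
# maybe_destroy: run `tiup cluster destroy` only if the cluster is still
# registered with tiup. Hypothetical helper for illustration.
maybe_destroy() {
  name="$1"
  if command -v tiup >/dev/null 2>&1 && tiup cluster list 2>/dev/null | grep -q "^${name}[[:space:]]"; then
    tiup cluster destroy "$name" --yes
  else
    echo "cluster ${name} not found; nothing to destroy"
  fi
}

maybe_destroy tidb-test-cluster
```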

| username: Anna | Original post link

Try using destroy.

| username: zhanggame1 | Original post link

I use destroy; I haven't tested clean.
destroy removes everything cleanly.

| username: TiDBer_ZbEmSPmr | Original post link

Both destroy and clean fail with the same error here; --force works.

| username: TiDBer_ZbEmSPmr | Original post link

Switched to v5.4.1 with the same deployment method and it started successfully. Is there a problem with 7.1.0?

| username: yilong | Original post link

Use netstat to check whether the port is occupied, or try changing the port.

| username: redgame | Original post link

You can try modifying the cluster topology file to assign the relevant components ports that do not conflict with those of components already running.
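For instance, a topology fragment that moves TiFlash off its default ports might look like this (a sketch; the field names are the standard tiup cluster topology keys, while the host is a placeholder and the port values are arbitrary non-default examples):

```yaml
tiflash_servers:
  - host: 10.0.1.1                # placeholder host
    tcp_port: 9001                # default 9000, the port reported as occupied
    http_port: 8124               # default 8123
    flash_service_port: 3931      # default 3930
    flash_proxy_port: 20171       # default 20170
    flash_proxy_status_port: 20293  # default 20292
    metrics_port: 8235            # default 8234
```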