Failed to set root password of TiDB database when I deployed a stand-alone cluster

Note that this question was originally posted on the Chinese TiDB Forum. We copied it here to help new users solve their problems quickly.

Application environment

Test environment

TiDB version

TiDB v6.0.0

Reproduction method

I followed the TiDB quick start guide to deploy a stand-alone cluster:

[root@node1:0 ~]# tiup cluster deploy liking v6.0.0 ./topo.yaml --user root -p
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.9.3/tiup-cluster /root/.tiup/components/cluster/v1.9.3/tiup-cluster deploy liking v6.0.0 ./topo.yaml --user root -p
Input SSH password: 

+ Detect CPU Arch
  - Detecting node 192.168.222.11 ... Done
Please confirm your topology:
Cluster type:    tidb
Cluster name:    liking
Cluster version: v6.0.0
Role        Host            Ports                            OS/Arch       Directories
----        ----            -----                            -------       -----------
pd          192.168.222.11  2379/2380                        linux/x86_64  /u01/tidb/deploy/pd-2379,/u01/tidb/data/pd-2379
tikv        192.168.222.11  20160/20180                      linux/x86_64  /u01/tidb/deploy/tikv-20160,/u01/tidb/data/tikv-20160
tikv        192.168.222.11  20161/20181                      linux/x86_64  /u01/tidb/deploy/tikv-20161,/u01/tidb/data/tikv-20161
tikv        192.168.222.11  20162/20182                      linux/x86_64  /u01/tidb/deploy/tikv-20162,/u01/tidb/data/tikv-20162
tidb        192.168.222.11  4000/10080                       linux/x86_64  /u01/tidb/deploy/tidb-4000
tiflash     192.168.222.11  9000/8123/3930/20170/20292/8234  linux/x86_64  /u01/tidb/deploy/tiflash-9000,/u01/tidb/data/tiflash-9000
prometheus  192.168.222.11  9090/12020                       linux/x86_64  /u01/tidb/deploy/prometheus-9090,/u01/tidb/data/prometheus-9090
grafana     192.168.222.11  3000                             linux/x86_64  /u01/tidb/deploy/grafana-3000
Attention:
    1. If the topology is not what you expected, check your yaml file.
    2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y
+ Generate SSH keys ... Done
+ Download TiDB components
  - Download pd:v6.0.0 (linux/amd64) ... Done
  - Download tikv:v6.0.0 (linux/amd64) ... Done
  - Download tidb:v6.0.0 (linux/amd64) ... Done
  - Download tiflash:v6.0.0 (linux/amd64) ... Done
  - Download prometheus:v6.0.0 (linux/amd64) ... Done
  - Download grafana:v6.0.0 (linux/amd64) ... Done
  - Download node_exporter: (linux/amd64) ... Done
  - Download blackbox_exporter: (linux/amd64) ... Done
+ Initialize target host environments
  - Prepare 192.168.222.11:22 ... Done
+ Deploy TiDB instance
  - Copy pd -> 192.168.222.11 ... Done
  - Copy tikv -> 192.168.222.11 ... Done
  - Copy tikv -> 192.168.222.11 ... Done
  - Copy tikv -> 192.168.222.11 ... Done
  - Copy tidb -> 192.168.222.11 ... Done
  - Copy tiflash -> 192.168.222.11 ... Done
  - Copy prometheus -> 192.168.222.11 ... Done
  - Copy grafana -> 192.168.222.11 ... Done
  - Deploy node_exporter -> 192.168.222.11 ... Done
  - Deploy blackbox_exporter -> 192.168.222.11 ... Done
+ Copy certificate to remote host
+ Init instance configs
  - Generate config pd -> 192.168.222.11:2379 ... Done
  - Generate config tikv -> 192.168.222.11:20160 ... Done
  - Generate config tikv -> 192.168.222.11:20161 ... Done
  - Generate config tikv -> 192.168.222.11:20162 ... Done
  - Generate config tidb -> 192.168.222.11:4000 ... Done
  - Generate config tiflash -> 192.168.222.11:9000 ... Done
  - Generate config prometheus -> 192.168.222.11:9090 ... Done
  - Generate config grafana -> 192.168.222.11:3000 ... Done
+ Init monitor configs
  - Generate config node_exporter -> 192.168.222.11 ... Done
  - Generate config blackbox_exporter -> 192.168.222.11 ... Done
+ Check status
Enabling component pd
        Enabling instance 192.168.222.11:2379
        Enable instance 192.168.222.11:2379 success
Enabling component tikv
        Enabling instance 192.168.222.11:20162
        Enabling instance 192.168.222.11:20160
        Enabling instance 192.168.222.11:20161
        Enable instance 192.168.222.11:20160 success
        Enable instance 192.168.222.11:20162 success
        Enable instance 192.168.222.11:20161 success
Enabling component tidb
        Enabling instance 192.168.222.11:4000
        Enable instance 192.168.222.11:4000 success
Enabling component tiflash
        Enabling instance 192.168.222.11:9000
        Enable instance 192.168.222.11:9000 success
Enabling component prometheus
        Enabling instance 192.168.222.11:9090
        Enable instance 192.168.222.11:9090 success
Enabling component grafana
        Enabling instance 192.168.222.11:3000
        Enable instance 192.168.222.11:3000 success
Enabling component node_exporter
        Enabling instance 192.168.222.11
        Enable 192.168.222.11 success
Enabling component blackbox_exporter
        Enabling instance 192.168.222.11
        Enable 192.168.222.11 success
Cluster `liking` deployed successfully, you can start it with command: `tiup cluster start liking --init`

Problem

The first time I started the cluster, I got an error:

[root@node1:0 ~]# tiup cluster start liking --init
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.9.3/tiup-cluster /root/.tiup/components/cluster/v1.9.3/tiup-cluster start liking --init
Starting cluster liking...
+ [ Serial ] - SSHKeySet: privateKey=/root/.tiup/storage/cluster/clusters/liking/ssh/id_rsa, publicKey=/root/.tiup/storage/cluster/clusters/liking/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [Parallel] - UserSSH: user=tidb, host=192.168.222.11
+ [ Serial ] - StartCluster
Starting component pd
        Starting instance 192.168.222.11:2379
        Start instance 192.168.222.11:2379 success
Starting component tikv
        Starting instance 192.168.222.11:20162
        Starting instance 192.168.222.11:20160
        Starting instance 192.168.222.11:20161
        Start instance 192.168.222.11:20160 success
        Start instance 192.168.222.11:20162 success
        Start instance 192.168.222.11:20161 success
Starting component tidb
        Starting instance 192.168.222.11:4000
        Start instance 192.168.222.11:4000 success
Starting component tiflash
        Starting instance 192.168.222.11:9000
        Start instance 192.168.222.11:9000 success
Starting component prometheus
        Starting instance 192.168.222.11:9090
        Start instance 192.168.222.11:9090 success
Starting component grafana
        Starting instance 192.168.222.11:3000
        Start instance 192.168.222.11:3000 success
Starting component node_exporter
        Starting instance 192.168.222.11
        Start 192.168.222.11 success
Starting component blackbox_exporter
        Starting instance 192.168.222.11
        Start 192.168.222.11 success
+ [ Serial ] - UpdateTopology: cluster=liking
Started cluster `liking` successfully
Failed to set root password of TiDB database to 'G^174F*P!3t2sz&Wd5'

Error: dial tcp 192.168.222.11:4000: connect: connection refused

Verbose debug logs has been written to /root/.tiup/logs/tiup-cluster-debug-2022-04-08-15-01-41.log.

It seemed that the port was not available. But in fact, port 4000 was not in use, the host still had hundreds of megabytes of free memory, and the CPU load was low. I had also changed MaxSessions to 20 in the SSH configuration according to the TiDB documentation. What was the reason?
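For reference, checks like the ones described above can be run on the host with standard commands, for example:

ss -lntp | grep -w 4000                    # is anything listening on port 4000?
free -m                                    # free memory on the host
grep -i maxsessions /etc/ssh/sshd_config   # the MaxSessions value mentioned above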

It’s probably not very useful to read the TiUP log; all you can see there is “connection refused”.

Check the tidb-server log on its host to see why the instance fails to start or why the port is not being listened on. Then solve the problem according to the error message.
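For example, with the directories shown in the topology above, the tidb-server log would normally be under the instance’s deploy directory (the exact path is an assumption based on that layout):

tail -n 50 /u01/tidb/deploy/tidb-4000/log/tidb.log   # tidb-server log under the deploy directory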

Thanks. The tidb-server log is as follows:

[2022/04/11 15:52:09.114 +08:00] [FATAL] [main.go:691] ["failed to create the server"] [error="failed to cleanup stale Unix socket file /tmp/tidb-4000.sock: dial unix /tmp/tidb-4000.sock: connect: permission denied"] [stack="main.createServer
    /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:691
main.main
    /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:205
runtime.main
    /usr/local/go/src/runtime/proc.go:250"]

The reason is that you don’t have permission on the /tmp/tidb-4000.sock file.

Execute the chmod -R 777 /tmp command.

Or delete the .sock file manually and then start the cluster again. If the problem recurs, the permissions of the /tmp directory are not set correctly, and you need to execute the command above to grant the permission.
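In command form, that is roughly:

rm -f /tmp/tidb-4000.sock            # remove the stale socket file left by the earlier attempt
tiup cluster start liking -R tidb    # start only the tidb instances of this cluster again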

You can check the permissions of the file on your machine first:

ll /tmp/tidb-4000.sock

I’ve checked the permissions. A /tmp/tidb-4000.sock file owned by root was left over from my earlier deployment test. After I deleted it, the tidb-server started successfully.

I saw a /tmp/tidb-4000.sock file under /tmp. I deleted this file and ran tiup cluster start liking -R tidb again. It worked:

[root@node1:0 ~]# tiup cluster display liking      
tiup is checking updates for component cluster ...
Starting component `cluster`: /root/.tiup/components/cluster/v1.9.3/tiup-cluster /root/.tiup/components/cluster/v1.9.3/tiup-cluster display liking
Cluster type:       tidb
Cluster name:       liking
Cluster version:    v6.0.0
Deploy user:        tidb
SSH type:           builtin
Dashboard URL:      http://192.168.222.11:2379/dashboard
ID                    Role        Host            Ports                            OS/Arch       Status   Data Dir                        Deploy Dir
--                    ----        ----            -----                            -------       ------   --------                        ----------
192.168.222.11:3000   grafana     192.168.222.11  3000                             linux/x86_64  Up       -                               /u01/tidb/deploy/grafana-3000
192.168.222.11:2379   pd          192.168.222.11  2379/2380                        linux/x86_64  Up|L|UI  /u01/tidb/data/pd-2379          /u01/tidb/deploy/pd-2379
192.168.222.11:9090   prometheus  192.168.222.11  9090/12020                       linux/x86_64  Up       /u01/tidb/data/prometheus-9090  /u01/tidb/deploy/prometheus-9090
192.168.222.11:4000   tidb        192.168.222.11  4000/10080                       linux/x86_64  Up       -                               /u01/tidb/deploy/tidb-4000
192.168.222.11:9000   tiflash     192.168.222.11  9000/8123/3930/20170/20292/8234  linux/x86_64  Up       /u01/tidb/data/tiflash-9000     /u01/tidb/deploy/tiflash-9000
192.168.222.11:20160  tikv        192.168.222.11  20160/20180                      linux/x86_64  Up       /u01/tidb/data/tikv-20160       /u01/tidb/deploy/tikv-20160
192.168.222.11:20161  tikv        192.168.222.11  20161/20181                      linux/x86_64  Up       /u01/tidb/data/tikv-20161       /u01/tidb/deploy/tikv-20161
192.168.222.11:20162  tikv        192.168.222.11  20162/20182                      linux/x86_64  Up       /u01/tidb/data/tikv-20162       /u01/tidb/deploy/tikv-20162
Total nodes: 8
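As a final check that port 4000 is now reachable, you can hit the TiDB status port or connect with a MySQL client, for example:

curl http://192.168.222.11:10080/status      # TiDB HTTP status API on the status port
mysql -h 192.168.222.11 -P 4000 -u root -p   # SQL port; enter the root password when prompted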