Simulating Deployment of Production Environment Cluster on a Single Machine Error: ssh_stderr: Failed to enable unit: Unit file pd-2379.service does not exist

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 在单机上模拟部署生产环境集群报错:ssh_stderr: Failed to enable unit: Unit file pd-2379.service does not exist

| username: TiDBer_oHpB1az2

OS: Fedora Linux 38 (Workstation Edition) x86_64
Kernel: 6.3.7-200.fc38.x86_64

$ tiup cluster deploy tidb_local v7.1.0 ./topo.yaml --user root -p
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/user/.tiup/components/cluster/v1.12.3/tiup-cluster deploy tidb_local v7.1.0 ./topo.yaml --user root -p
Input SSH password: 

+ Detect CPU Arch Name
  - Detecting node 172.21.117.108 Arch info ... Done

+ Detect CPU OS Name
  - Detecting node 172.21.117.108 OS info ... Done
Please confirm your topology:
Cluster type:    tidb
Cluster name:    tidb_local
Cluster version: v7.1.0
Role        Host            Ports                            OS/Arch       Directories
----        ----            -----                            -------       -----------
pd          172.21.117.108  2379/2380                        linux/x86_64  /data/tidb/deploy/pd-2379,/data/tidb/data/pd-2379
tikv        172.21.117.108  20160/20180                      linux/x86_64  /data/tidb/deploy/tikv-20160,/data/tidb/data/tikv-20160
tikv        172.21.117.108  20161/20181                      linux/x86_64  /data/tidb/deploy/tikv-20161,/data/tidb/data/tikv-20161
tikv        172.21.117.108  20162/20182                      linux/x86_64  /data/tidb/deploy/tikv-20162,/data/tidb/data/tikv-20162
tidb        172.21.117.108  4000/10080                       linux/x86_64  /data/tidb/deploy/tidb-4000
tiflash     172.21.117.108  9000/8123/3930/20170/20292/8234  linux/x86_64  /data/tidb/deploy/tiflash-9000,/data/tidb/data/tiflash-9000
prometheus  172.21.117.108  9090/12020                       linux/x86_64  /data/tidb/deploy/prometheus-9090,/data/tidb/data/prometheus-9090
grafana     172.21.117.108  3000                             linux/x86_64  /data/tidb/deploy/grafana-3000
Attention:
    1. If the topology is not what you expected, check your yaml file.
    2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: (default=N) y
+ Generate SSH keys ... Done
+ Download TiDB components
  - Download pd:v7.1.0 (linux/amd64) ... Done
  - Download tikv:v7.1.0 (linux/amd64) ... Done
  - Download tidb:v7.1.0 (linux/amd64) ... Done
  - Download tiflash:v7.1.0 (linux/amd64) ... Done
  - Download prometheus:v7.1.0 (linux/amd64) ... Done
  - Download grafana:v7.1.0 (linux/amd64) ... Done
  - Download node_exporter: (linux/amd64) ... Done
  - Download blackbox_exporter: (linux/amd64) ... Done
+ Initialize target host environments
  - Prepare 172.21.117.108:22 ... Done
+ Deploy TiDB instance
  - Copy pd -> 172.21.117.108 ... Done
  - Copy tikv -> 172.21.117.108 ... Done
  - Copy tikv -> 172.21.117.108 ... Done
  - Copy tikv -> 172.21.117.108 ... Done
  - Copy tidb -> 172.21.117.108 ... Done
  - Copy tiflash -> 172.21.117.108 ... Done
  - Copy prometheus -> 172.21.117.108 ... Done
  - Copy grafana -> 172.21.117.108 ... Done
  - Deploy node_exporter -> 172.21.117.108 ... Done
  - Deploy blackbox_exporter -> 172.21.117.108 ... Done
+ Copy certificate to remote host
+ Init instance configs
  - Generate config pd -> 172.21.117.108:2379 ... Done
  - Generate config tikv -> 172.21.117.108:20160 ... Done
  - Generate config tikv -> 172.21.117.108:20161 ... Done
  - Generate config tikv -> 172.21.117.108:20162 ... Done
  - Generate config tidb -> 172.21.117.108:4000 ... Done
  - Generate config tiflash -> 172.21.117.108:9000 ... Done
  - Generate config prometheus -> 172.21.117.108:9090 ... Done
  - Generate config grafana -> 172.21.117.108:3000 ... Done
+ Init monitor configs
  - Generate config node_exporter -> 172.21.117.108 ... Done
  - Generate config blackbox_exporter -> 172.21.117.108 ... Done
Enabling component pd
	Enabling instance 172.21.117.108:2379
Failed to enable unit: Unit file pd-2379.service does not exist.


Error: failed to enable/disable pd: failed to enable: 172.21.117.108 pd-2379.service, please check the instance's log(/data/tidb/deploy/pd-2379/log) for more detail.: executor.ssh.execute_failed: Failed to execute command over SSH for 'tidb@172.21.117.108:22' {ssh_stderr: Failed to enable unit: Unit file pd-2379.service does not exist.
, ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/bin:/sbin:/usr/bin:/usr/sbin /usr/bin/sudo -H bash -c "systemctl daemon-reload && systemctl enable pd-2379.service"}, cause: Process exited with status 1
| username: zhanggame1 | Original post link

Take a look at the logs in /data/tidb/deploy/pd-2379/log to see if there are any further details.
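A minimal sketch of that check, to run on the target host. The paths come from the topology shown above; adjust them if your deploy_dir differs.

```shell
# Paths taken from the topology in this thread; adjust if yours differ.
LOG_DIR=/data/tidb/deploy/pd-2379/log
UNIT=pd-2379.service

# Show the most recent PD log lines, if any were written:
sudo tail -n 50 "$LOG_DIR"/*.log 2>/dev/null || echo "no logs under $LOG_DIR"

# Confirm whether the systemd unit file was actually created:
ls -l "/etc/systemd/system/$UNIT" 2>/dev/null || echo "$UNIT not found"
```

If the unit file is missing while the deploy directory exists, the deploy step likely failed partway through writing the systemd units.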

| username: zhanggame1 | Original post link

Deployment recommendations can be found in the official documentation:
Using TiUP to Deploy a TiDB Cluster | PingCAP Documentation Center
Run a check, with automatic repair, before deploying to see whether there are any issues:

tiup cluster check ./topology.yaml --apply --user root -p
| username: redgame | Original post link

There is an issue with the SSH service; I suggest reinstalling SSH.

| username: Anna | Original post link

It looks like the SSH connection failed.

| username: TiDBer_oHpB1az2 | Original post link

$ tiup cluster check ./topo.yaml --apply --user root -p
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/user/.tiup/components/cluster/v1.12.3/tiup-cluster check ./topo.yaml --apply --user root -p
Input SSH password:

+ Detect CPU Arch Name
  - Detecting node 172.21.117.108 Arch info ... Done
+ Detect CPU OS Name
  - Detecting node 172.21.117.108 OS info ... Done
+ Download necessary tools
  - Downloading check tools for linux/amd64 ... Done
+ Collect basic system information
  - Getting system info of 172.21.117.108:22 ... Done
+ Check time zone
  - Checking node 172.21.117.108 ... Done
+ Check system requirements
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
  - Checking node 172.21.117.108 ... Done
+ Cleanup check files
  - Cleanup check files on 172.21.117.108:22 ... Done
Node            Check           Result  Message
172.21.117.108 sysctl Fail will try to set 'net.core.somaxconn = 32768'
172.21.117.108 sysctl Fail will try to set 'net.ipv4.tcp_syncookies = 0'
172.21.117.108 sysctl Fail will try to set 'vm.swappiness = 0'
172.21.117.108 thp Fail will try to disable THP, please check again after reboot
172.21.117.108 service Fail service irqbalance not found, should be installed and started
172.21.117.108 exist Fail /etc/systemd/system/tikv-20160.service already exists, auto fixing not supported
172.21.117.108 exist Fail /etc/systemd/system/tiflash-9000.service already exists, auto fixing not supported
172.21.117.108 exist Fail /etc/systemd/system/tikv-20161.service already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/data/prometheus-9090 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/data/tikv-20160 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tikv-20160/log already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tikv-20162/log already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tikv-20161/log already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tidb-4000 already exists, auto fixing not supported
172.21.117.108 exist Fail /etc/systemd/system/prometheus-9090.service already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tikv-20160 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/data/tikv-20161 already exists, auto fixing not supported
172.21.117.108 exist Fail /etc/systemd/system/tikv-20162.service already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/data/tiflash-9000 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/data/pd-2379 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tiflash-9000 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/pd-2379/log already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tiflash-9000/log already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/prometheus-9090 already exists, auto fixing not supported
172.21.117.108 exist Fail /etc/systemd/system/tidb-4000.service already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/prometheus-9090/log already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tikv-20161 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/pd-2379 already exists, auto fixing not supported
172.21.117.108 exist Fail /etc/systemd/system/pd-2379.service already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/grafana-3000 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tidb-4000/log already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/grafana-3000/log already exists, auto fixing not supported
172.21.117.108 exist Fail /etc/systemd/system/grafana-3000.service already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/deploy/tikv-20162 already exists, auto fixing not supported
172.21.117.108 exist Fail /data/tidb/data/tikv-20162 already exists, auto fixing not supported
172.21.117.108 memory Pass memory size is 32768MB
172.21.117.108 limits Fail will try to set 'tidb soft nofile 1000000'
172.21.117.108 limits Fail will try to set 'tidb hard nofile 1000000'
172.21.117.108 limits Fail will try to set 'tidb soft stack 10240'
172.21.117.108 selinux Fail will try to disable SELinux, reboot might be needed
172.21.117.108 os-version Fail os vendor fedora not supported, auto fixing not supported
172.21.117.108 command Fail numactl not usable, bash: numactl: command not found, auto fixing not supported
172.21.117.108 listening-port Fail port 3000 is already in use, auto fixing not supported
172.21.117.108 disk Fail mount point /data does not have 'nodelalloc' option set, auto fixing not supported
172.21.117.108 disk Fail multiple components tikv:/data/tidb/data/tikv-20160,tikv:/data/tidb/data/tikv-20161,tikv:/data/tidb/data/tikv-20162,tiflash:/data/tidb/data/tiflash-9000 are using the same partition 172.21.117.108:/data as data dir, auto fixing not supported
172.21.117.108 disk Warn mount point /data does not have 'noatime' option set, auto fixing not supported
172.21.117.108 cpu-cores Pass number of CPU cores / threads: 16
172.21.117.108 cpu-governor Fail CPU frequency governor is powersave, should use performance, auto fixing not supported
172.21.117.108 swap Warn will try to disable swap, please also check /etc/fstab manually
172.21.117.108 network Pass network speed of enp2s0 is 10000MB
172.21.117.108 network Pass network speed of veth8e29ecf is 10000MB
172.21.117.108 network Pass network speed of veth919b5c7 is 10000MB
172.21.117.108 network Pass network speed of vethcb63708 is 10000MB
172.21.117.108 network Pass network speed of docker0 is 10000MB
172.21.117.108 network Pass network speed of eno1 is 1000MB

+ Try to apply changes to fix failed checks
  - Applying changes on 172.21.117.108 ... Done
| username: zhanggame1 | Original post link

It hasn't been installed successfully yet, right? Try destroying the cluster first and then redeploying.
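The destroy-then-redeploy sequence looks roughly like this. Note that destroy is destructive: it removes the cluster's data and deploy directories. The cluster name and topology file are the ones used earlier in the thread.

```shell
# Tear down the half-deployed cluster (tiup asks for confirmation):
tiup cluster destroy tidb_local

# Then deploy again from the same topology:
tiup cluster deploy tidb_local v7.1.0 ./topo.yaml --user root -p
```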

| username: xingzhenxiang | Original post link

Try reinstalling it.

| username: TiDBer_oHpB1az2 | Original post link

@zhanggame1 After testing, tiup can indeed do one-click deployment on CentOS, but it doesn't work on Fedora. I'm looking forward to TiDB improving the deployment and operations experience. Thank you.

| username: zhanggame1 | Original post link

I tested it with the latest Ubuntu 22.04.2 LTS and had no issues. CentOS 7 will soon stop being supported, so we plan to deploy our production services on Ubuntu.

| username: TiDBer_oHpB1az2 | Original post link

Could you please test Fedora 38?

| username: TiDBer_oHpB1az2 | Original post link

Additionally, how is the support for CentOS 9 Stream? I am about to test it.

| username: buptzhoutian | Original post link

It's not an SSH connection issue: the SSH tunnel is fine, and the earlier file transfers all succeeded. The error occurs when this command is executed over SSH. You can check the status of this systemd unit on the target machine.
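That check can be sketched as a small shell helper. The unit name comes from the error above; `check_unit` is a hypothetical helper, not part of tiup or systemd.

```shell
# check_unit: report whether systemd has a unit file for an instance.
check_unit() {
    local unit="/etc/systemd/system/$1.service"
    if [ -f "$unit" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# For the failing PD instance:
check_unit pd-2379

# If it reports "present", inspect the unit directly with:
#   sudo systemctl status pd-2379.service
```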

| username: TiDBer_ZevpzaDp | Original post link

How can I obtain the SSH password?

| username: TiDBer_小阿飞 | Original post link

Is this operating system supported?

| username: Billmay表妹 | Original post link

You can check this document.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.