Error When Scaling Down Nodes Using TiUP

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 通过tiup缩容节点报错 (Error when scaling in a node via TiUP)

| username: TiDBer_hjacbdMD

[TiDB Usage Environment]

Production

[TiDB Version]

v4.0.5

[Encountered Issue]

An error occurred when executing the following command.

Command:

tiup cluster scale-in jichujiagou-bigdata-tidb --force --node 10.20.34.27:4000

Error:

Process exited with status 1
transfer from /home/tidb/.tiup/storage/cluster/clusters/jichujiagou-bigdata-tidb/config-cache/tikv-172.17.144.143-20160.service to /tmp/tikv_89549722-c523-47bf-b596-d769900441e5.service failed
github.com/pingcap/tiup/pkg/cluster/spec.(*BaseInstance).InitConfig
github.com/pingcap/tiup@/pkg/cluster/spec/instance.go:159
github.com/pingcap/tiup/pkg/cluster/spec.(*TiKVInstance).InitConfig
github.com/pingcap/tiup@/pkg/cluster/spec/tikv.go:173
github.com/pingcap/tiup/pkg/cluster/task.(*InitConfig).Execute
github.com/pingcap/tiup@/pkg/cluster/task/init_config.go:49
github.com/pingcap/tiup/pkg/cluster/task.(*Serial).Execute
github.com/pingcap/tiup@/pkg/cluster/task/task.go:189
github.com/pingcap/tiup/pkg/cluster/task.(*Parallel).Execute.func1
github.com/pingcap/tiup@/pkg/cluster/task/task.go:242
runtime.goexit
runtime/asm_amd64.s:1357
init config failed: 172.17.144.143:20160

[Reproduction Path]

[Issue Phenomenon and Impact]

Uncertain if scaling in will affect the existing cluster.

| username: songxuecheng | Original post link

What is the status of this TiKV?
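A quick way to check is from the TiUP control machine (a minimal sketch; the cluster name is the one used in the scale-in command above):

# Show the Status column for every instance, then narrow down to the TiKV from the error
tiup cluster display jichujiagou-bigdata-tidb
tiup cluster display jichujiagou-bigdata-tidb | grep 172.17.144.143:20160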

| username: h5n1 | Original post link

Please post the result of tiup cluster display. Is the file /home/tidb/.tiup/storage/cluster/clusters/jichujiagou-bigdata-tidb/config-cache/tikv-172.17.144.143-20160.service still there?
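For reference, a minimal sketch of that file check on the control machine (the path is the one from the error message):

# Confirm the cached systemd unit template for the TiKV instance exists and is readable
ls -l /home/tidb/.tiup/storage/cluster/clusters/jichujiagou-bigdata-tidb/config-cache/tikv-172.17.144.143-20160.service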

| username: TiDBer_hjacbdMD | Original post link

The node 10.20.34.27:4000 is no longer visible in the topology, and another node’s machine has been reclaimed, so a scale-in operation is needed. However, I’m worried there might be issues, so I haven’t taken any action for now.

| username: TiDBer_hjacbdMD | Original post link

Yes, the file is still there.

| username: Lucien-卢西恩 | Original post link

Hello,

From the logs you provided, it seems that the TiKV instance was not cleaned up. If you are unsure, you can use pd-ctl to check the status of the Region replicas in the cluster and see whether any Regions have fewer than 2 replicas. You can refer to the documentation on checking Region status.

You can also check whether the Leader status is normal.
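For reference, a minimal sketch of those checks with pd-ctl (the PD address is a placeholder; on a TiUP-managed cluster the same subcommands can usually also be run through tiup ctl pd):

# Regions that currently have fewer peers than the configured replica count
pd-ctl -u http://<pd-host>:2379 region check miss-peer

# Regions with peers reported as down
pd-ctl -u http://<pd-host>:2379 region check down-peer

# Per-store overview: state, leader_count, and region_count show whether
# Leaders are still distributed across the healthy TiKV stores
pd-ctl -u http://<pd-host>:2379 store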

| username: system | Original post link

This topic will be automatically closed 60 days after the last reply. No new replies are allowed.