How to Decommission a Specific Node on a Machine in a Mixed Deployment of TiDB 3.0

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiDB3.0混合部署情况下,如何只下线机器上的某个节点

| username: WalterLYU

Background: TiKV and TiDB are deployed together on instance 192.168.0.1
Requirement: Need to decommission the TiDB node on instance 192.168.0.1
Issue: According to the official documentation (“Scale the TiDB Cluster Using TiDB Ansible” | PingCAP Archived Docs), the precautions state: “In the following scale-in example, the node being removed has no other services deployed on it; if other services are co-deployed on the node, you cannot follow these steps.”

Is there any other way to achieve this requirement? (PS: Upgrading the TiDB version is not considered for now)
Or can I manually log in to the 192.168.0.1 node and kill the tidb-server process (tidb_server_pid)?

| username: 啦啦啦啦啦 | Original post link

Try using -l to limit to the machine and -t to specify the service tag, but it’s best to try this in a test environment first. Version 3.0 is too old; I’ve almost forgotten about it.
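
For example, a minimal sketch (assuming the 3.0-era tidb-ansible stop.yml tags its plays by service name; verify the tag names in your copy of stop.yml before running):

ansible-playbook stop.yml -l 192.168.0.1 -t tidb    # stop only the tidb service on this host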

| username: xingzhenxiang | Original post link

Hurry up and upgrade.

| username: wangccsy | Original post link

Upgrade to the new version.

| username: kkpeter | Original post link

You can directly kill this TiDB process.
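
If you go that route, a minimal sketch (the PID placeholder is hypothetical; tidb-server shuts down gracefully on SIGTERM):

ps aux | grep tidb-server    # find the tidb-server PID
kill <tidb_server_pid>       # plain kill sends SIGTERM, which triggers a graceful shutdown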

| username: Kongdom | Original post link

This only stops the node from running; it does not remove the node from the cluster. You still need to follow the official documentation.

| username: 哈喽沃德 | Original post link

If you want to decommission a TiDB node, you can use TiUP to complete this operation. First, delete the node’s information from the inventory.ini configuration file, and then use the following command to scale in the cluster:

tiup cluster scale-in <cluster_name> -N <node_name>

Here, <cluster_name> is the name of the cluster, and <node_name> is the ID of the node to be decommissioned (TiUP identifies nodes by host:port).
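
For example, with hypothetical values (mycluster is a placeholder cluster name; 4000 is the default TiDB port):

tiup cluster scale-in mycluster -N 192.168.0.1:4000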

| username: Kongdom | Original post link

Version 3.0.6 does not have TiUP yet.

| username: TiDBer_vfJBUcxl | Original post link

Upgrade the version.

| username: Jellybean | Original post link

TiDB 3.0 does not use TiUP yet; it uses the Ansible tool (tidb-ansible).

If you want to operate the old cluster, you can use Ansible to manage cluster nodes, including scaling, starting/stopping, upgrading, and so on.

If there are only a few nodes, you can manually execute commands to complete the tasks. For the issue raised by the original poster, you can do the following:

  1. Adjust the front-end load balancer: remove the TiDB server node to be taken offline from its upstream list so that business traffic moves away (see the sketch after this post).
  2. Use Ansible to scale in, or manually stop the service with systemctl stop and then clean up the deployment directory to take the node offline.
  3. Check the cluster status to confirm the scale-in is complete.

The TiDB server is stateless, so the operation is relatively simple. As long as no front-end traffic reaches the node, there should be no major issues.
If it were a stateful PD or TiKV node, a rolling restart of the cluster would also be needed, which is more complicated.
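
For step 1, a minimal sketch of the load balancer change, assuming HAProxy fronts the TiDB servers (all names, ports, and addresses here are examples, not from this thread):

# /etc/haproxy/haproxy.cfg excerpt
listen tidb-cluster
    bind *:3390
    mode tcp
    balance roundrobin
    #server tidb-1 192.168.0.1:4000 check   # commented out: the node being decommissioned
    server tidb-2 192.168.0.2:4000 check
    server tidb-3 192.168.0.3:4000 check

After editing, sudo systemctl reload haproxy picks up the change; the old HAProxy process keeps serving established connections while new ones go to the remaining nodes.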

| username: 随缘天空 | Original post link

Scale down, though this version is quite old, so I’m not sure it supports that. Killing the process should also work; you can give it a try.

| username: TiDBer_vfJBUcxl | Original post link

Upgrade the version. The official documentation describes scaling in nodes with TiUP starting from v4.0.

| username: zhaokede | Original post link

It’s far behind the latest version; try using kill.

| username: Kongdom | Original post link

Version 3.0 also supports scaling down.
https://docs-archive.pingcap.com/zh/tidb/v3.0/scale-tidb-using-ansible

| username: dba远航 | Original post link

Upgrade to the new version

| username: Kongdom | Original post link

v3.0 is also supported.
https://docs-archive.pingcap.com/zh/tidb/v3.0/scale-tidb-using-ansible

| username: Jellybean | Original post link

Scaling down TiDB nodes:

  1. Modify the front-end load balancer to migrate traffic away.
  2. Stop the corresponding instances by node name:

ansible-playbook stop.yml -l tidb-name1,tidb-name2

Or manually stop the service process:

sudo systemctl stop tidb-4000.service

  3. Edit the inventory.ini file to remove the node information.
  4. Update the Prometheus configuration and restart it:

ansible-playbook rolling_update_monitor.yml --tags=prometheus

  5. Check the cluster status to confirm the node has been removed.
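
To verify the last step on the decommissioned node, a quick sketch (assuming the default TiDB status port 10080):

curl http://192.168.0.1:10080/status    # should fail to connect once the node is down
ps aux | grep tidb-server               # no tidb-server process should remain
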
| username: 双开门变频冰箱 | Original post link

Hurry up and upgrade.

| username: 随缘天空 | Original post link

That makes it simple; just follow that solution directly.

| username: WalterLYU | Original post link

Using sudo systemctl stop tidb-4000.service works.
Thank you :pray: