Beginner's Guide to Handling TiKV Disk Space Shortage and Horizontal Scaling by Adding Nodes

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 小白 学习TIKV磁盘空间不足,水平扩容加节点

| username: TiDBer_bOR8eMEn

【TiDB Usage Environment】Production Environment
【TiDB Version】5.23
【Reproduction Path】Insufficient disk space, expand with a node of the same disk size
【Encountered Problem: Phenomenon and Impact】Since the previous cluster used the default three nodes, all parameters were set to default. Now, I want to bring a new identical node online but have no idea how to proceed. Could any expert provide a detailed tutorial on what preparations are important in this process?
【Resource Configuration】

| username: 江湖故人 | Original post link

The operating system configuration should be optimized first to ensure normal performance;
Scaling out has some impact on the business, so pick an off-peak period such as a weekend, and set aside at least a day for the data to migrate.
Refer to the official documentation for the operation steps.

| username: TiDBer_小阿飞 | Original post link

You can use TiUP to scale out TiKV: write the new node's topology into a scale-out.yml file and pass that file to the scale-out command. Make sure SSH mutual trust is configured between the servers, and check for potential risks in advance.
TiKV configuration file reference:

tikv_servers:
  - host: 10.0.1.5
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: /tidb-deploy/tikv-20160
    data_dir: /tidb-data/tikv-20160
    log_dir: /tidb-deploy/tikv-20160/log
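
The overall flow around that file can be sketched as a short shell script. This is a minimal sketch, not a definitive procedure: the cluster name tidb-test and the file name scale-out.yml are placeholders, and the run helper with its DRY_RUN switch is a hypothetical convenience that only prints the commands by default. The check, scale-out, and display subcommands of tiup cluster are real, but verify the flags against the documentation for your TiUP version.

```shell
#!/bin/sh
# Sketch of a TiKV scale-out with TiUP. CLUSTER and TOPO are placeholders.
CLUSTER="${CLUSTER:-tidb-test}"   # your cluster name (placeholder)
TOPO="${TOPO:-scale-out.yml}"     # topology file listing only the new node(s)
DRY_RUN="${DRY_RUN:-1}"           # 1 = print commands only; 0 = really run them

# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

scale_out_plan() {
    # 1. Check the new host against the existing cluster before changing anything.
    run tiup cluster check "$CLUSTER" "$TOPO" --cluster || return 1
    # 2. Add the node(s) described in the topology file.
    run tiup cluster scale-out "$CLUSTER" "$TOPO" || return 1
    # 3. Confirm that the new TiKV node is listed.
    run tiup cluster display "$CLUSTER" || return 1
}

scale_out_plan
```

Run it as-is to review the three commands, then set DRY_RUN=0 on the machine where tiup is installed once the plan looks right.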

For detailed configuration, refer to the official documentation:

| username: TiDBer_bOR8eMEn | Original post link

Where is the scale-out.yml file? I couldn’t find it.

| username: 像风一样的男子 | Original post link

scale-out.yml TiKV configuration file reference:

tikv_servers:
  - host: 10.0.1.5
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: /tidb-deploy/tikv-20160
    data_dir: /tidb-data/tikv-20160
    log_dir: /tidb-deploy/tikv-20160/log

| username: TiDBer_小阿飞 | Original post link

The file does not exist by default. Just create a new scale-out.yml yourself.
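
One way to create the file is with a shell heredoc. This is a minimal sketch: the function name write_scale_out and the address 10.0.1.6 are placeholders, and the ports and directories simply copy the reference posted earlier in this thread, so adjust them to match your existing TiKV nodes.

```shell
#!/bin/sh
# Hypothetical helper: write a scale-out topology for one new TiKV node.
# $1 = IP of the new host, $2 = output file (defaults to scale-out.yml).
write_scale_out() {
    new_host="$1"
    out_file="${2:-scale-out.yml}"
    cat > "$out_file" <<EOF
tikv_servers:
  - host: ${new_host}
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: /tidb-deploy/tikv-20160
    data_dir: /tidb-data/tikv-20160
    log_dir: /tidb-deploy/tikv-20160/log
EOF
}

write_scale_out 10.0.1.6 scale-out.yml   # 10.0.1.6 is an example address
cat scale-out.yml
```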

| username: Kongdom | Original post link

You can check whether other files, such as log files, are taking up space on the 0.4 node.
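
A quick way to see what is taking up space on a node is to list the largest entries under the deploy and data directories. This is a generic sketch using the standard df, du, and sort tools. The function name biggest_entries is hypothetical, and the /tidb-deploy path is an assumption taken from the example topology in this thread, so substitute your own directories.

```shell
#!/bin/sh
# Hypothetical helper: list the N largest entries directly under a directory.
# Sizes are in KiB (du -k), largest first.
biggest_entries() {
    dir="$1"
    count="${2:-10}"
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

TARGET="${TARGET:-/tidb-deploy}"   # assumed path; set TARGET to your own directory
if [ -d "$TARGET" ]; then
    df -h "$TARGET"                # overall usage of the filesystem holding TARGET
    biggest_entries "$TARGET" 10
else
    echo "directory $TARGET not found; set TARGET to your deploy or data directory"
fi
```

If the log directory dominates the output, cleaning up or rotating old logs may free space without needing a new node right away.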

| username: TiDBer_bOR8eMEn | Original post link

The file size of the database is inconsistent.

| username: TiDBer_bOR8eMEn | Original post link

If I want to scale out, should this configuration file list only the new node? I don't need to include the original three?

| username: TiDBer_小阿飞 | Original post link

No need. The file should contain only the topology of the nodes being added.

| username: TiDBer_bOR8eMEn | Original post link

Got it.

| username: TiDBer_bOR8eMEn | Original post link

Let me ask another question. I have a TiDB cluster with four machines. Can I edit this file on any of the TiDB machines?

| username: tidb菜鸟一只 | Original post link

The simplest method is to use tiup cluster edit-config tidb-test (your cluster name) to find the configuration of your current TiKV node:

tikv_servers:
- host: 10.10.10.14
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /u01/tidb-deploy/tikv-20160
  data_dir: /u01/tidb-data/tikv-20160
  log_dir: /u01/tidb-deploy/tikv-20160/log
  arch: amd64
  os: linux

Copy it out, change the IP, and that will be your scale-out.yml file. This ensures it matches your original configuration exactly.
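
That copy-and-edit step can also be scripted. This is a rough sketch, assuming you have pasted the existing TiKV entry into a file (here called tikv-entry.yml, a hypothetical name), that the file holds a single entry, and that 10.10.10.15 stands in for the new node's address. The sed expression rewrites only the host line and leaves everything else untouched.

```shell
#!/bin/sh
# Hypothetical helper: copy a TiKV topology entry and replace only its host IP.
# $1 = file with the copied entry, $2 = new host IP; the result goes to stdout.
swap_host() {
    sed "s/^\([[:space:]]*- host:[[:space:]]*\).*/\1$2/" "$1"
}

# Example entry, as copied from the existing cluster configuration.
cat > tikv-entry.yml <<'EOF'
tikv_servers:
- host: 10.10.10.14
  ssh_port: 22
  port: 20160
  status_port: 20180
  deploy_dir: /u01/tidb-deploy/tikv-20160
  data_dir: /u01/tidb-data/tikv-20160
  log_dir: /u01/tidb-deploy/tikv-20160/log
  arch: amd64
  os: linux
EOF

swap_host tikv-entry.yml 10.10.10.15 > scale-out.yml   # 10.10.10.15 is a placeholder
cat scale-out.yml
```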

| username: TiDBer_小阿飞 | Original post link

Yes, but it has to be a machine with tiup installed, because the topology file is passed to a tiup command. As long as the host and port are specified correctly in the topology file, it will work.

| username: TiDBer_bOR8eMEn | Original post link

Got it.

| username: TiDBer_bOR8eMEn | Original post link

OK, thank you.

| username: Kongdom | Original post link

No, this only works on the control machine. If you edit the file on a machine that is not the control machine, the change only affects that machine.

| username: TiDBer_bOR8eMEn | Original post link

How do I determine which one is my central control machine?

| username: Kongdom | Original post link

In practice, this is indeed difficult to determine. Generally, the server on which tiup is deployed is the control machine. However, it is also possible that tiup is deployed on two servers, as our company has encountered this situation before. In such cases, you can only determine the control machine through the handover sheet or by comparing the configuration files under tiup with the configuration files on each node.
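
One practical check, offered as a sketch rather than an official method: TiUP keeps each cluster's metadata on the control machine, by default under ~/.tiup/storage/cluster/clusters/<cluster-name>/meta.yaml (or under $TIUP_HOME if that variable is set). The function name has_cluster_meta and the cluster name tidb-test are placeholders. Run it on each candidate server as the user who operates tiup.

```shell
#!/bin/sh
# Hypothetical helper: succeed if this host holds TiUP metadata for the cluster.
# Assumes the default layout:
#   ${TIUP_HOME:-~/.tiup}/storage/cluster/clusters/<cluster-name>/meta.yaml
has_cluster_meta() {
    cluster="$1"
    tiup_home="${TIUP_HOME:-$HOME/.tiup}"
    [ -f "$tiup_home/storage/cluster/clusters/$cluster/meta.yaml" ]
}

CLUSTER="${CLUSTER:-tidb-test}"   # placeholder cluster name
if has_cluster_meta "$CLUSTER"; then
    echo "$(uname -n): TiUP metadata for $CLUSTER found; this looks like a control machine"
else
    echo "$(uname -n): no TiUP metadata for $CLUSTER here"
fi
```

If two servers both report metadata, compare the meta.yaml on each with the configuration actually deployed on the nodes to decide which copy is current.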

| username: DBAER | Original post link

Refer to the documentation and modify the configuration information accordingly: tiup cluster scale-out | PingCAP Documentation Center