Shutting Down a TiKV

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关掉一个tikv

| username: TiDBer_Y2d2kiJh

[TiDB Usage Environment] Production Environment
[TiDB Version] v5.4.0
[Reproduction Path] None (planned maintenance on v5.4.0, not an incident)
[Encountered Issue: Issue Phenomenon and Impact] The database currently experiences high latency during working hours. I want to take the three TiKV nodes offline one at a time to add memory and CPU to each. Is this operation feasible, and are there any precautions to take?
[Resource Configuration]
2 TiDB nodes, 32 GB RAM / 16 cores each
2 PD nodes, 32 GB RAM / 16 cores each
3 TiKV nodes, 32 GB RAM / 16 cores each

[Attachments: Screenshots/Logs/Monitoring]

| username: tidb菜鸟一只 | Original post link

Can you free up one more machine? The recommended approach is to scale out first and then scale in.
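
For reference, a scale-out-then-scale-in cycle with TiUP would look roughly like this (the topology file and node address are placeholders; adjust them to your cluster):

tiup cluster scale-out <cluster-name> scale-out.yaml
tiup cluster scale-in <cluster-name> -N <old-tikv-ip>:20160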

| username: TiDBer_Y2d2kiJh | Original post link

That would indeed be a good solution, but I have just confirmed that no spare server is available.

| username: xfworld | Original post link

If the workload is read-only with no writes, you can proceed with the approach you described.

If you want to protect the production workload, it is still best to scale out first and then scale in…

| username: tidb菜鸟一只 | Original post link

If there are no other machines available, this approach can also work. However, to be safe, stop the application traffic first, then evict all leaders from one TiKV node to the other nodes, shut that machine down, add its memory and CPU, and bring the node back up, repeating this node by node. Of course, if you are bold enough, you can simply force-kill one TiKV node, upgrade it, and restart it.
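
A rough sketch of one such cycle, with placeholder addresses and IDs that you should check against your own cluster before running anything:

# evict all leaders from the TiKV store that is about to be upgraded
tiup ctl:v5.4.0 pd -u http://<pd-ip>:2379 scheduler add evict-leader-scheduler <store-id>
# once that store's leader count has dropped to 0, stop just that instance
tiup cluster stop <cluster-name> -N <tikv-ip>:20160
# add memory/CPU on the machine, then bring the instance back
tiup cluster start <cluster-name> -N <tikv-ip>:20160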

| username: TiDBer_pkQ5q1l0 | Original post link

Shutting down one node won't cause problems. First evict the leaders, then upgrade the node, and finally remove the leader eviction.

| username: TiDBer_Y2d2kiJh | Original post link

Could you please explain how to evict a leader?

| username: TiDBer_jYQINSnf | Original post link

No need to evict, just unplug the network cable. A new leader will be elected quickly. However, there is a prerequisite: you need to check that all regions indeed have 3 replicas.
Check the Grafana PD dashboard for region monitoring.

Alternatively:
pd-ctl region check miss-peer
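
For example, assuming a v5.4.0 cluster and a PD endpoint of <pd-ip>:2379, the same check can be run through TiUP; if it lists no regions, every region has its full set of replicas:

tiup ctl:v5.4.0 pd -u http://<pd-ip>:2379 region check miss-peer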

| username: dba-kit | Original post link

You can check the pd-ctl documentation. The command to evict the leaders from a specific store is scheduler add evict-leader-scheduler <store-id>.
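
Once added, the scheduler should appear in the scheduler list, and it can be removed after the maintenance so leaders can move back (the store ID below is a placeholder):

tiup ctl:v5.4.0 pd -u http://<pd-ip>:2379 scheduler show
tiup ctl:v5.4.0 pd -u http://<pd-ip>:2379 scheduler remove evict-leader-scheduler-<store-id>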

| username: dba-kit | Original post link

After all the leaders have been evicted (check Grafana monitoring), you can then use tiup cluster stop <cluster-name> -N <tikv-instance>.
PS: Also pay attention to the PD configuration "max-store-down-time" (default "30m0s"); if the hardware upgrade takes longer than this, PD will consider the store down and start replenishing its regions on the other nodes.
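
A hedged sketch of checking the current value and, if the hardware work may take longer than 30 minutes, raising it temporarily (remember to set it back afterwards):

tiup ctl:v5.4.0 pd -u http://<pd-ip>:2379 config show | grep max-store-down-time
tiup ctl:v5.4.0 pd -u http://<pd-ip>:2379 config set max-store-down-time 1h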

| username: 孤君888 | Original post link

No impact.

| username: TiDBer_Y2d2kiJh | Original post link

Hello, how can I tell whether the data has been rebalanced after the upgrade?

| username: tidb菜鸟一只 | Original post link

Check the TiKV-related panels in Grafana and compare the store scores and region counts across the TiKV nodes.

| username: TiDBer_pkQ5q1l0 | Original post link

In the Grafana Overview > TiKV panels: when the leader counts are almost the same and the region counts do not differ much across the nodes, the data has balanced out.
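
If Grafana is not handy, a rough check is also possible with pd-ctl: the leader_count and region_count reported for the three stores should be close to one another.

tiup ctl:v5.4.0 pd -u http://<pd-ip>:2379 store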

| username: TiDBer_Y2d2kiJh | Original post link

For some reason the Grafana login password is no longer the default “admin”. Could you please tell me how to reset the Grafana login password?

| username: DBRE | Original post link

Reference: Grafana password reset, 灰信网 (a software development blog aggregator).
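
In case that link is unreachable: the usual way is Grafana's own CLI, run on the Grafana host. A hedged example, where the homepath is a placeholder for the Grafana directory of your deployment:

grafana-cli --homepath <grafana-install-dir> admin reset-admin-password <new-password>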

| username: TiDBer_Y2d2kiJh | Original post link

Thank you, it’s been handled.

| username: system | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.