Slow TiKV Decommissioning

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv下线慢

| username: 等一分钟

[TiDB Usage Environment] Production Environment
[TiDB Version] 6.5.1
[Reproduction Path]
Currently there are 6 TiKV nodes, each with 2.1G of data. Yesterday I added two temporary TiKV nodes, and once data migration to them reached around 40G I started taking them offline again. It has been almost 18 hours and they are still Pending Offline. Should I keep waiting?
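
One way to check whether the offline is actually progressing is with pd-ctl, assuming a tiup deployment (the PD address is a placeholder):

```shell
# List every store with its state; the two draining stores should show
# "Offline" with region_count steadily decreasing.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 store

# Check whether PD is still generating operators (e.g. remove-peer)
# for the offline stores.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 operator show
```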

| username: 等一分钟 | Original post link

[Screenshots omitted: the original images were not available for translation.]

| username: TiDBer_小阿飞 | Original post link

First, try to recover it, then take it offline again. It seems to be stuck!
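
A sketch of that recover-and-retry flow (`store cancel-delete` requires pd-ctl v6.1 or later; IDs and addresses are placeholders):

```shell
# Find the store IDs of the two Pending Offline nodes.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 store

# Cancel the pending deletion so the store returns to Up.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 store cancel-delete <store-id>

# Re-run the scale-in once the cluster is healthy again.
tiup cluster scale-in <cluster-name> --node <tikv-ip>:20160
```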

| username: 小龙虾爱大龙虾 | Original post link

Is it 2.1G of data per machine? If you can’t take it offline, I suggest checking the PD monitoring panel.
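
If Grafana is not handy, roughly the same numbers can be pulled from the PD API (the address is a placeholder; jq is optional):

```shell
# Per-store state, remaining space, and region counts straight from PD.
curl -s http://<pd-ip>:2379/pd/api/v1/stores |
  jq '.stores[] | {id: .store.id, state: .store.state_name,
                   available: .status.available, regions: .status.region_count}'
```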

| username: 等一分钟 | Original post link

Before adding the new nodes, each TiKV had 2.1T of data.

Does it mean that if the disk usage of a TiKV node exceeds 80%, it will stop migrating?

| username: 路在何chu | Original post link

Are both machines going offline at the same time?

| username: 等一分钟 | Original post link

Yes. I also see that disk usage on the other TiKV nodes has exceeded 80%.

| username: 等一分钟 | Original post link

[Screenshot omitted: the original image was not available for translation.]

| username: 烂番薯0 | Original post link

Isn’t it usually an odd number of nodes?

| username: 路在何chu | Original post link

That won’t work. If regions are left with fewer than three replicas, the offline will get stuck. You need to wait for one TiKV’s replicas to finish migrating before taking the other TiKV offline.
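
To tell whether the offline is stuck rather than just slow, pd-ctl can list the regions holding it up (a sketch; the PD address is a placeholder):

```shell
# Regions that still keep a peer on an offline store.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 region check offline-peer

# Regions short of replicas; these block the offline from finishing.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 region check miss-peer
```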

| username: 等一分钟 | Original post link

There were previously 6 TiKV nodes.

| username: 等一分钟 | Original post link

[Screenshot omitted: the original image was not available for translation.]

| username: 路在何chu | Original post link

There are no specific requirements for this; we have always had 4 machines.

| username: 等一分钟 | Original post link

Do I need to expand the disks on the other TiKV nodes so that usage drops below 80%?

| username: 等一分钟 | Original post link

Why is the region count on the newly added nodes still increasing?

| username: 小龙虾爱大龙虾 | Original post link

If a store’s available space falls below the low-space-ratio threshold, PD will stop scheduling regions to that node. Adjusting low-space-ratio is not recommended, because TiKV must reserve a certain amount of space; otherwise compaction cannot run, since it always allocates new space before releasing old space :joy_cat:
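
To see where that threshold sits (low-space-ratio defaults to 0.8, i.e. 80% usage; raising it is exactly what the note above warns against):

```shell
# Show the current space thresholds in the PD schedule config.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 config show | grep -i space

# Only if you accept the compaction-space risk described above:
# tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 config set low-space-ratio 0.85
```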

| username: 小龙虾爱大龙虾 | Original post link

Regions split when there are writes, and splitting is done by TiKV itself, not by PD scheduling.
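
A simple way to observe this: sample a store's counters a few minutes apart; region_count rising on a draining store points to local splits rather than PD scheduling (the store ID and address are placeholders, and the grep pattern depends on the exact output format):

```shell
# pd-ctl prints the store as JSON; grep out the two counters of interest.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 store <store-id> | grep -E 'region_count|leader_count'
```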

| username: 等一分钟 | Original post link

I performed a scale-in operation on these two machines yesterday. Will data still be written to them?

| username: 等一分钟 | Original post link

I need to take these two TiKV nodes offline now. How should I do it?
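
For reference, a typical sequence under tiup, assuming the scale-in has already been issued (names, IDs, and addresses are placeholders):

```shell
# 1. Confirm the two stores are Offline and note their store IDs.
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 store

# 2. Optionally raise the remove-peer limit on each draining store
#    to speed up the migration (the default store limit is 15).
tiup ctl:v6.5.1 pd -u http://<pd-ip>:2379 store limit <store-id> 200 remove-peer

# 3. When region_count reaches 0, the store turns Tombstone; then clean up.
tiup cluster display <cluster-name>
tiup cluster prune <cluster-name>
```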