TiKV cannot automatically replenish down-peers after enabling placement_rule

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: tikv 在启用 placement_rule 后无法自动补全 down-peer

| username: Smityz

[TiDB Usage Environment] Testing Environment
[TiDB Version]
v6.5.3
[Reproduction Path]

  1. Set the placement rules as follows (see the pd-ctl sketch after this list):
[
  {
    "group_id": "pd",
    "id": "1",
    "start_key": "",
    "end_key": "",
    "role": "voter",
    "is_witness": false,
    "count": 2,
    "label_constraints": [
      {
        "key": "disk_type",
        "op": "in",
        "values": [
          "ssd"
        ]
      }
    ],
    "location_labels": [
      "host"
    ],
    "isolation_level": "host",
    "create_timestamp": 1706696121
  },
  {
    "group_id": "pd",
    "id": "2",
    "start_key": "",
    "end_key": "",
    "role": "follower",
    "is_witness": false,
    "count": 1,
    "label_constraints": [
      {
        "key": "disk_type",
        "op": "in",
        "values": [
          "mix"
        ]
      }
    ],
    "location_labels": [
      "host"
    ],
    "isolation_level": "host",
    "create_timestamp": 1706696121
  }
]
  2. Start 8 nodes, where 3 nodes have the label disk_type=mix and the other 5 nodes have the label disk_type=ssd. Then load data and observe that there is no leader on the mix nodes, which is expected.
  3. Force one of the mix nodes offline, and observe that the down-peer count remains high and does not recover.
  4. After disabling the placement rules, the down-peer count starts to drop back to normal.
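
For reference, a minimal sketch of how rules like the above can be applied and verified with pd-ctl; the PD address and file name below are placeholders, not from the original post:

# Dump the current placement rules, edit them, then load the edited file back into PD.
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 config placement-rules load --out=rules.json
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 config placement-rules save --in=rules.json

# Confirm the rules that PD actually holds.
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 config placement-rules show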

I don't know whether the replicas stay incomplete because of a problem in the placement rule logic itself, or because of the policy I wrote.

| username: TiDBer_jYQINSnf | Original post link

First, check the PD logs.

| username: Smityz | Original post link

I didn't find any suspicious logs. The down-peer-region count only started to decline after I disabled the placement rules.

| username: TiDBer_jYQINSnf | Original post link

I don’t see any problem with your rule either.
You could manually add an operator to add a replica when a down-peer appears and see if it succeeds.
I don’t know much about placement rules, so I’ll also wait for others to solve it.
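
For reference, a sketch of what that manual operator would look like with pd-ctl; the PD address, region ID, and store ID below are placeholders:

# List regions that currently report down peers.
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 region check down-peer

# Ask PD to add a replica of region 1234 on store 5 (both IDs are placeholders).
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 operator add add-peer 1234 5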

| username: 裤衩儿飞上天 | Original post link

What is your max-store-down-time set to?
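
For reference, a sketch of how to check it with pd-ctl (placeholder PD address); the value is part of PD's schedule configuration and defaults to 30m:

# Print PD's configuration and filter for max-store-down-time.
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 config show | grep -i max-store-down-time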

| username: Smityz | Original post link

Manually adding replicas does work around the problem, but it means the replicas on a follower node cannot be restored automatically after the node goes down. I opened a GitHub issue to track this problem:
Down-peer-region can’t recover when enable placement-rule policy · Issue #16480 · tikv/tikv (github.com)

| username: Smityz | Original post link

No special modifications; it should be the default value of 30m.
It is worth noting that as soon as I disabled the placement rules, the cluster was able to replenish the replicas automatically, so I suspect the issue lies with the placement rules.
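
For reference, toggling placement rules from pd-ctl looks roughly like this; a sketch assuming the v6.5 pd-ctl syntax, with a placeholder PD address:

# Disable placement rules, so PD falls back to plain max-replicas scheduling.
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 config placement-rules disable

# Re-enable them after the down peers have been replenished.
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 config placement-rules enable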

| username: 裤衩儿飞上天 | Original post link

It feels a bit strange. Is any server running two or more TiKV instances?

Additionally, you could try configuring the placement rules through SQL.

| username: Smityz | Original post link

Yes, we have many instances deployed on k8s.
Since we are using RawKV, there is no way to define the rules with SQL.
If we used SQL, what would the resulting rules actually look like? Would I still run into the same issue?

| username: 裤衩儿飞上天 | Original post link

  1. If a server has 2 or more TiKV instances deployed, then this is not unusual.
  2. My understanding is that placement can be configured through the SQL interface; I don't know whether it has the same issue, I can only suggest trying it out. Higher versions recommend using SQL for the configuration, but you are using RawKV.
  3. I don't have any other good suggestions; wait for other experts here or for answers on your issue.

| username: Smityz | Original post link

Thank you for your help.

| username: dba-kit | Original post link

I suspect this is the cause: your configuration seems to require at least one follower replica on the disk_type=mix label. Configuring placement rules directly through PD yourself can easily lead to problems. What is the background for doing this, and what effect are you trying to achieve? It looks like you want to decommission the three nodes labeled disk_type=mix?

| username: dba-kit | Original post link

I remember that version 6.5.3 supports the coexistence of TiDB and RawKV, and it is even recommended to have a tidb-server node.

| username: 小龙虾爱大龙虾 | Original post link

Without TiDB, I can’t do garbage collection. :joy_cat:

| username: 小龙虾爱大龙虾 | Original post link

For your issue, you can check the PD panels to see why the scheduling operators are not being generated.
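
For reference, the operators PD currently has can also be inspected directly with pd-ctl (placeholder PD address):

# Show the operators PD currently has in flight; an empty result suggests no repair is being scheduled.
tiup ctl:v6.5.3 pd -u http://127.0.0.1:2379 operator show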

| username: Smityz | Original post link

The bug has been fixed in this PR, thanks for everyone’s help!