Some Questions About TiKV Proposal Apply

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: 关于 TiKV proposal apply 的一点疑问

| username: TiDBer_bIuwDpsD

At the end of the article "TiKV Source Code Reading Series (18) Raft Propose Commit and Apply Scenario Analysis", there is the following passage:

There is a special case here: the so-called “empty log”. In the raft-rs implementation, when a new Leader is elected, it broadcasts an “empty log” in order to commit the logs from the previous term (for details, see the Raft paper). At that moment, some proposals made in the previous term may still be pending, and because a new Leader has been elected, these proposals can never be confirmed. They therefore need to be cleaned up; otherwise the associated callbacks would never be called and some resources would never be released. For the cleanup logic, see the ApplyFsm::handle_entries_normal function.
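The cleanup described in the passage can be sketched roughly as follows. This is a simplified illustration, not TiKV's actual code: the `ApplySide` class, its `propose` and `apply_entry` methods, and the result strings are hypothetical names chosen for the example. It assumes each pending proposal is remembered together with the (index, term) it was proposed at, and that a proposal is confirmed only when a committed entry with exactly that index and term is applied.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque


@dataclass
class PendingProposal:
    index: int                       # log index the proposal was appended at
    term: int                        # term in which it was proposed
    callback: Callable[[str], None]  # invoked exactly once with the outcome


class ApplySide:
    """Hypothetical stand-in for the apply side; not TiKV's real type."""

    def __init__(self) -> None:
        self.pending: Deque[PendingProposal] = deque()

    def propose(self, index: int, term: int, callback: Callable[[str], None]) -> None:
        self.pending.append(PendingProposal(index, term, callback))

    def apply_entry(self, index: int, term: int) -> None:
        """Apply one committed entry (an empty entry is handled the same way)."""
        while self.pending:
            head = self.pending[0]
            if head.index == index and head.term == term:
                # The committed entry is exactly what was proposed.
                self.pending.popleft()
                head.callback("ok")
                return
            if head.term > term or (head.term == term and head.index > index):
                # The head proposal belongs to a later entry; leave it queued.
                return
            # The head proposal can never match a future entry: report it stale.
            self.pending.popleft()
            head.callback("StaleCommand")


results = []
side = ApplySide()
side.propose(4, 1, lambda r: results.append((4, r)))
side.propose(5, 1, lambda r: results.append((5, r)))
side.apply_entry(4, 1)  # entry 4 was committed as proposed
side.apply_entry(5, 2)  # index 5 now holds a newer-term (empty) entry
```

In this sketch the first callback receives "ok" and the second receives "StaleCommand", so every callback is eventually invoked even though the second proposal was never confirmed.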

The passage says: “there may still be some proposals proposed in the previous term that are still in the pending stage, and because a new Leader has been generated, these proposals can never be confirmed.” My question is: shouldn’t these proposals be cleaned up, with a StaleCommand error returned right away, at the moment this leader lost leadership, rather than waiting until it is re-elected as leader?

| username: xfworld

The cleanup should be part of the log alignment that takes place after the election completes.

If a proposal is not confirmed, that is, it has not been replicated to more than half of the nodes, the commit fails, which also preserves consistency.

| username: TiDBer_jYQINSnf

My question is: shouldn’t these proposals be cleaned up, with a StaleCommand error returned right away, at the moment this leader lost leadership, rather than waiting until it is re-elected as leader?

A (leader) [1 2 3 4] B [1 2 3] C [1 2 3]
At this point an election occurs, and C is elected.
A [1 2 3 4] B [1 2 3] C (leader) [1 2 3]
C needs to append an empty log.
A [1 2 3 4] B [1 2 3 5] C (leader) [1 2 3 5]
A then receives 5 as well and simply deletes 4.
The final state is:
A [1 2 3 5] B [1 2 3 5] C (leader) [1 2 3 5]
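The step where A deletes 4 can be sketched with the follower-side AppendEntries rule from the Raft paper: if an existing entry conflicts with an incoming one (same index, different term), the follower deletes that entry and everything after it. This is a minimal illustration under that assumption, not raft-rs code; the `append_entries` function and the (index, term) tuples are hypothetical. Because entries here are identified by position and term, the new leader's empty entry occupies position 4 with term 2, playing the role of the entry labelled 5 in the example above.

```python
from typing import List, Tuple

Entry = Tuple[int, int]  # (index, term); index is 1-based


def append_entries(log: List[Entry], prev_index: int, prev_term: int,
                   entries: List[Entry]) -> bool:
    """Follower-side handling of AppendEntries; returns False if rejected."""
    # Consistency check: the follower must hold the entry preceding the new ones.
    if prev_index > 0:
        if prev_index > len(log) or log[prev_index - 1][1] != prev_term:
            return False
    for index, term in entries:
        if index <= len(log):
            if log[index - 1][1] == term:
                continue  # already stored, nothing to do
            # Conflict: drop this entry and everything after it.
            del log[index - 1:]
        log.append((index, term))
    return True


# A still holds its own uncommitted entry from term 1 at position 4.
a_log = [(1, 1), (2, 1), (3, 1), (4, 1)]
b_log = [(1, 1), (2, 1), (3, 1)]

# C, leader in term 2, sends its empty entry after position 3.
empty_entry = [(4, 2)]
append_entries(a_log, 3, 1, empty_entry)  # A drops (4, 1), stores (4, 2)
append_entries(b_log, 3, 1, empty_entry)  # B simply appends (4, 2)
```

After both calls, `a_log` and `b_log` are identical, which matches the final state in the example.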

| username: TiDBer_bIuwDpsD

My understanding is that the first AppendEntries request C sends to A carries term 5. At that point A knows it has lost leadership, so it should clear all of its local proposals and return a StaleCommand error for each.

| username: TiDBer_bIuwDpsD

Why wait for log alignment once the node already knows it has lost leadership? Wouldn’t it be simpler and clearer, as a design, to clean up right away?
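One consideration here, assuming standard Raft semantics, is that the outcome of an old-term proposal may not yet be decided when leadership is lost: if the entry had already reached the node that becomes the new leader, it can still be committed. The sketch below compares the two cases. The helper names (`majority_match`, `proposal_outcome`) and the node labels are hypothetical, and nothing here is taken from TiKV's code.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, int]  # (index, term); index is 1-based


def majority_match(last_index: Dict[str, int]) -> int:
    """Highest log index that is stored on a majority of nodes."""
    ordered = sorted(last_index.values(), reverse=True)
    return ordered[len(ordered) // 2]


def proposal_outcome(committed: List[Entry], index: int, term: int) -> str:
    """Outcome of a proposal made at (index, term), given the committed log."""
    if index > len(committed):
        return "pending"
    return "ok" if committed[index - 1] == (index, term) else "StaleCommand"


base = [(1, 1), (2, 1), (3, 1)]

# Case 1: A's entry (4, 1) had already reached C before C was elected.
# C keeps it and appends its empty entry at index 5 in term 2.
c_log_1 = base + [(4, 1), (5, 2)]
commit_1 = majority_match({"A": 4, "B": 5, "C": 5})
outcome_1 = proposal_outcome(c_log_1[:commit_1], 4, 1)

# Case 2: the entry never left A. C's empty entry takes index 4 in term 2.
c_log_2 = base + [(4, 2)]
commit_2 = majority_match({"A": 3, "B": 4, "C": 4})
outcome_2 = proposal_outcome(c_log_2[:commit_2], 4, 1)
```

In case 1 the proposal ends up committed (`outcome_1` is "ok"), while in case 2 it is overwritten (`outcome_2` is "StaleCommand"). At the moment A steps down it cannot tell which case it is in, which would be one reason to defer the decision until the entry at that index is actually applied.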

| username: TiDBer_jYQINSnf

In the example I gave above, the numbers 1 2 3 4 5 are log indexes, and 5 is the first empty log.