Will a TiFlash node going down affect cluster access?

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: TiFlash节点down掉会影响集群访问么?

| username: Kongdom

[TiDB Usage Environment] Production
[TiDB Version] v5.4.2
[Encountered Problem]
The cluster has 2 TiFlash nodes. After one of them lost its network connection, queries on the related data started failing. Does TiFlash also need more than half of its nodes alive to stay available?

| username: Kongdom | Original post link

Does TiFlash also need to keep more than half of the nodes alive to be available?

| username: 啦啦啦啦啦 | Original post link

You don’t need a majority. As long as at least one node holding a replica is still up, queries will work.
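
To check whether a second node actually holds a replica, the replica count per table can be inspected. This is a minimal sketch, assuming the standard information_schema.tiflash_replica table; the schema name test and table name t are hypothetical placeholders, not names from this thread.

```sql
-- List every table that has TiFlash replicas, with its replica count
-- and whether the replicas are currently reported as available.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica;

-- Hypothetical table test.t: request 2 replicas so that each of the
-- two TiFlash nodes can hold a full copy.
ALTER TABLE test.t SET TIFLASH REPLICA 2;
```

If REPLICA_COUNT is 1 on a two-node deployment, losing either node would likely leave some Regions without any TiFlash replica, which could explain failures like the one described.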

| username: h5n1 | Original post link

Theoretically, TiFlash is just a Raft Learner and does not participate in voting, so the concept of a majority does not apply. I suspect the connection handling is not robust enough: when a TiFlash node returns certain errors, the SQL statement fails, which is why the fallback_to_tikv parameter (the tidb_allow_fallback_to_tikv system variable) exists.

Problem Summary:
When we get a backoff error from TiFlash, the SQL will always fail while the TiKV nodes are actually alive. We hope to fallback to TiKV after TiFlash is down.
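
The fallback_to_tikv parameter mentioned above appears to correspond to the TiDB system variable tidb_allow_fallback_to_tikv, available since v5.0. The following is a sketch of how it might be enabled, not a confirmed fix for this particular cluster; whether it helps depends on the error TiFlash returns.

```sql
-- Check the current value (an empty string means fallback is disabled).
SHOW VARIABLES LIKE 'tidb_allow_fallback_to_tikv';

-- Allow statements that fail on TiFlash to be retried on TiKV,
-- for the current session only.
SET SESSION tidb_allow_fallback_to_tikv = 'tiflash';

-- Or apply it cluster-wide for new sessions.
SET GLOBAL tidb_allow_fallback_to_tikv = 'tiflash';
```

Note that a query retried on TiKV may run much slower than it would on TiFlash, so it seems safer to try the session-level setting first.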

| username: Kongdom | Original post link

Oh, then there might be an issue with this node. I’ll check it again tonight.

| username: Kongdom | Original post link

I’ll take another look in the evening.
