Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: TiDB-TiKV重启一个节点引起应用连接提示失败 (Restarting one TiDB-TiKV node causes the application to report connection failures)
Question: Why does the application report a connection failure when a TiKV node is restarted? Aren't client connections handled by the TiDB server layer?
Process: Memory usage on the TiKV node was very high, so I restarted it manually.
Symptom: After the TiKV node was stopped, the application reported connection failures.
OP, please complete the title; having only the word “TiDB” is not appropriate.
As for the connection errors you saw on the application side after restarting TiKV, this is possible. If your application's SQL is accessing data whose leader is on that storage node at the moment you restart it, the data cannot be accessed normally. The database will then interrupt the request and return the relevant error information.
Is the application reporting an error or is it a TiDB node error? Can you provide a screenshot of the log?
Please post the error log. Also, your version is quite old; the current release is already 7.5.
It’s normal to receive a connection failure message when TiKV is stopped, as the underlying storage cannot be accessed, and the error is expected. External connections connect to TiDB, and TiDB internally connects to TiKV. That’s what I think.
Take a look at your operation records.
Insufficient node instances and replicas will result in the inability to provide normal services…
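To make the point about replicas concrete, here is a minimal sketch of the majority rule that Raft-based replication relies on. The function name `has_quorum` is made up for illustration and is not part of any TiDB or TiKV tool; it only does the arithmetic.

```python
def has_quorum(replicas: int, unavailable: int) -> bool:
    """Return True if a replica group still has a strict majority available.

    Hypothetical helper for illustration only: it models the majority rule,
    not any real TiKV or PD API.
    """
    if replicas <= 0 or not 0 <= unavailable <= replicas:
        raise ValueError("need replicas > 0 and 0 <= unavailable <= replicas")
    majority = replicas // 2 + 1  # more than half of the replicas
    return replicas - unavailable >= majority


# A few cases: (replica count, replicas currently down)
for replicas, down in [(3, 1), (3, 2), (1, 1)]:
    print(replicas, down, has_quorum(replicas, down))
```

Under this rule, a group with 3 replicas can lose 1 and keep serving, but a single-replica setup cannot lose any.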
This is quite normal.
What is the cluster structure like? You can check the logs of the TiDB nodes during the time period when the connection error was prompted to see if there are any issues with the connection from TiDB to TiKV.
The error during the TiKV restart might be because the client was using data on that TiKV node at the time.
Your version is too low. Normally, if it’s a multi-replica TiKV, restarting just one TiKV will only cause a backoff, but the service shouldn’t be interrupted.
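If the application does see short-lived errors during a restart, retrying with backoff on the application side may help ride them out. Below is a generic sketch, not TiDB's internal backoff logic. `TransientError`, `retry_with_backoff`, and `make_flaky` are made-up names, and the failing operation is simulated rather than a real database call.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error seen by the application."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=2.0, sleep=time.sleep):
    """Call operation(); on TransientError, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Delay doubles each attempt, capped at max_delay, plus some jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))


def make_flaky(failures):
    """Build a simulated operation that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientError("storage node unavailable (simulated)")
        return "ok"

    return operation


# Simulate two failures followed by success; sleeping is skipped for the demo.
print(retry_with_backoff(make_flaky(2), sleep=lambda seconds: None))
```

In a real application the `except` clause would catch whatever retryable exception your database driver raises, and only statements that are safe to re-run should be retried.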
It seems normal, there might be a connection accessing the leader.
Stopping a TiDB instance will affect the existing connections to that instance. You can post the specific error messages encountered in your business operations.