Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: tidb server是无状态的,这个“无状态”如何理解? (The TiDB server is stateless; how should this “stateless” be understood?)
The TiDB layer itself is stateless. In practice, multiple TiDB instances can be started, and a unified access address can be provided externally through load balancing components (such as LVS, HAProxy, or F5). Client connections can be evenly distributed across multiple TiDB instances to achieve load balancing.
I have always been confused about what “stateless” means. Could the experts please explain what “state” refers to here?
              
Stateless means not storing data and having no role differentiation between nodes, such as master-slave or leader-follower.
              
It does not store data. Like an application server, you can start one process or several, and they do not affect each other.
              
You can add N nodes when the load is high and remove N nodes again once the remaining ones can handle the load. From this perspective, it will be relatively easier…
Adding and removing nodes this way has no impact on request handling.
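
To make the scale-out and scale-in point concrete, here is a minimal sketch. It is a toy model, not TiDB code: `SharedStore`, `Frontend`, and `Balancer` are hypothetical names standing in for the storage layer, a stateless SQL node, and a load balancer such as HAProxy.

```python
class SharedStore:
    """Stand-in for the storage layer; the only place data lives."""
    def __init__(self):
        self.rows = {}


class Frontend:
    """Stand-in for a stateless SQL node: it holds no data of its own."""
    def __init__(self, name, store):
        self.name = name
        self.store = store

    def put(self, key, value):
        self.store.rows[key] = value

    def get(self, key):
        return self.store.rows.get(key)


class Balancer:
    """Round-robin over whatever frontends are currently registered."""
    def __init__(self):
        self.nodes = []
        self._next = 0

    def add(self, node):
        self.nodes.append(node)

    def remove(self, name):
        self.nodes = [n for n in self.nodes if n.name != name]

    def pick(self):
        if not self.nodes:
            raise RuntimeError("no frontend available")
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node


store = SharedStore()
lb = Balancer()
for i in range(3):
    lb.add(Frontend(f"frontend-{i}", store))

lb.pick().put("k1", "v1")                    # handled by frontend-0
lb.remove("frontend-0")                      # the node that served the write goes away
value_after_scale_in = lb.pick().get("k1")   # another node still sees the data

lb.remove("frontend-1")
lb.remove("frontend-2")                      # no frontends left: service is down, data is not
lb.add(Frontend("frontend-new", store))
value_after_restart = lb.pick().get("k1")    # a brand-new node reads the same data
```

Because the frontends keep nothing locally, any of them can be dropped or replaced. While no frontend is registered, requests fail, but nothing stored is lost.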
              
A stateless service does not store data and does not keep any client state between requests. Each request is independent, and the service does not rely on the state of previous requests or on session information.
              
Stateless means the node does not store data, so if one node goes down, the cluster as a whole remains usable.
Storage and SQL processing rely on different resources: storage is bound by I/O, while computation needs more CPU and memory.
In TiDB, the SQL layer is stateless.
To be specific, you need to understand what “stateful” and “stateless” mean.
Stateless Service
Each client request must contain self-descriptive information to identify the client’s identity. The server does not save any information about the client requester.
Benefits of Statelessness?
- Client requests do not depend on server information, and multiple requests do not need to access the same server.
- The server cluster and state are transparent to the client, allowing the server to migrate and scale freely, reducing server storage pressure.
What is Stateful?
Stateful services require the server to record client information for each session to identify the client’s identity and process requests based on the user’s identity. A typical design is the session in Tomcat.
For example, login: After a user logs in, we save the login information in the server session and give the user a cookie value to record the corresponding session. Then, in the next request, the user carries the cookie value, allowing us to identify the corresponding session and find the user’s information.
For more on state, see the article 分布式系统中的“无状态”和“有状态”详解 (A Detailed Explanation of “Stateless” and “Stateful” in Distributed Systems) on the Tencent Cloud Developer Community.
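
The login example above can be sketched as follows. This is a generic web-service illustration, not TiDB code: `StatefulServer`, `StatelessServer`, and `SECRET` are hypothetical names, and the HMAC-signed token is just one common way to make a request self-descriptive.

```python
import hashlib
import hmac
import secrets


class StatefulServer:
    """Remembers each session in its own memory, like a Tomcat session."""
    def __init__(self):
        self.sessions = {}              # session id -> user, local to this server

    def login(self, user):
        sid = secrets.token_hex(8)
        self.sessions[sid] = user
        return sid                      # handed to the client as a cookie value

    def whoami(self, sid):
        return self.sessions.get(sid)   # None if another server handled the login


SECRET = b"demo-secret"                 # hypothetical key shared by all servers


class StatelessServer:
    """Keeps nothing per client; each request carries a signed token."""
    def login(self, user):
        sig = hmac.new(SECRET, user.encode(), hashlib.sha256).hexdigest()
        return f"{user}.{sig}"

    def whoami(self, token):
        user, _, sig = token.rpartition(".")
        expected = hmac.new(SECRET, user.encode(), hashlib.sha256).hexdigest()
        return user if hmac.compare_digest(sig, expected) else None


a, b = StatefulServer(), StatefulServer()
cookie = a.login("alice")
stateful_same = a.whoami(cookie)        # the server that saw the login knows the user
stateful_other = b.whoami(cookie)       # a different server does not

c, d = StatelessServer(), StatelessServer()
token = c.login("alice")
stateless_other = d.whoami(token)       # any server can verify the token
forged = d.whoami("mallory.deadbeef")   # a bad signature is rejected
```

In the stateful case the follow-up request has to reach the same server, while in the stateless case any server can handle it, which is what lets the servers migrate and scale freely.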
              
Compared with PD, which has a leader, and TiKV stores, which hold Region leaders, TiDB only has roles such as the GC owner, so its statefulness is weak.
              
There are no state transitions and no distinction between master and slave.
              
Stateless means no data is stored. Even if all TiDB servers are lost, TiDB will not lose data.
              
Stateless means you can respawn in place without losing equipment.
Stateful means you might lose equipment (data loss).
              
Stateless means that even if this node’s TiDB server along with the machine is gone, no data will be lost.
              
My understanding is that it doesn’t store data, so even if the machine goes down, no data will be lost. However, the query service will still be affected.
              
You can freely start, stop, add, and remove nodes without affecting normal service.
              
Simply put, it does not store any data other than the cluster parameters of the TiDB Server itself.
              
Service downtime does not affect data storage.
              
Stateless means the node holds no data; execution happens in the server’s memory and does not affect the database’s operation.
              
According to the official documentation, there are two explanations:
- TiDB instances themselves are stateless, and instances cannot perceive each other’s existence, so they cannot confirm whether their writes conflict with those of other TiDB instances.
- TiDB nodes are all stateless, and the nodes themselves do not store data. The nodes are completely equal to each other.
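
The first point implies that write conflicts have to be detected somewhere other than in the SQL nodes. Here is a minimal sketch of that idea using a simple per-key version check in a shared store. This is an illustrative simplification, not TiDB's actual transaction protocol; `VersionedStore`, `Node`, and `ConflictError` are hypothetical names.

```python
class ConflictError(Exception):
    """Raised when a key changed after the writer read it."""


class VersionedStore:
    """Stand-in for the storage layer: keeps one version counter per key."""
    def __init__(self):
        self.data = {}   # key -> (version, value)

    def read(self, key):
        return self.data.get(key, (0, None))

    def commit(self, key, read_version, value):
        current_version, _ = self.data.get(key, (0, None))
        if current_version != read_version:
            raise ConflictError(f"{key} changed since version {read_version}")
        self.data[key] = (current_version + 1, value)


class Node:
    """Stand-in for a stateless SQL node; it knows the store, not its peers."""
    def __init__(self, store):
        self.store = store

    def begin(self, key):
        version, _ = self.store.read(key)
        return version

    def write(self, key, read_version, value):
        self.store.commit(key, read_version, value)


store = VersionedStore()
n1, n2 = Node(store), Node(store)

v1 = n1.begin("balance")
v2 = n2.begin("balance")         # both nodes read the same version
n1.write("balance", v1, 100)     # the first commit succeeds
try:
    n2.write("balance", v2, 200) # the second commit is based on a stale version
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

Neither node ever talks to the other; the conflict only becomes visible when the second write reaches the shared store.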
              
Stateless means there is no data to be saved and no data to be persisted.
              
I think of the TiDB server as being like nginx.