Accessing TiDB-v7.1.1 via HAProxy with Lightning

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: lightning通过haproxy访问tidb-v7.1.1

| username: 在路上123

  1. Error Description
    1. Executing the lightning data import command reports an error
tiup is checking updates for component tidb-lightning ...
Starting component `tidb-lightning`: /home/tidb/.tiup/components/tidb-lightning/v7.1.1/tidb-lightning -config tidb-lightning.toml
Verbose debug logs will be written to tidb-lightning.log

tidb lightning encountered error: cannot read schema 'prod_dw_sjzt' from remote: Get "http://xxxxxxxx:10080/schema/prod_dw_sjzt": dial tcp xxxxxxxx:10080: connect: connection refused
    2. The tidb-lightning.toml configuration is as follows

  2. Cause

    1. The access path from lightning to the database is: lightning -> haproxy -> tidb-cluster
    2. During import, lightning takes the host and status-port from tidb-lightning.toml and tries to reach the TiDB status API at host:10080. However, the host configured here is the IP address of the haproxy host, while the status-port (10080) is a port of the tidb instance. Nothing listens on port 10080 on the haproxy host, so the connection is refused.
  3. Question
    How can this issue be resolved? Is access through haproxy currently unsupported?
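
To make the cause concrete, here is a minimal Python sketch of how the failing request URL is put together. The URL shape is copied from the error message above. The function name and the IP addresses are hypothetical and only serve as an illustration. This is not lightning's actual code.

```python
def schema_url(host: str, status_port: int, database: str) -> str:
    """Build a status-API URL of the shape seen in the error message."""
    return f"http://{host}:{status_port}/schema/{database}"


# Hypothetical addresses, for illustration only.
haproxy_ip = "192.0.2.10"  # HAProxy host that forwards only the SQL port
tidb_ip = "192.0.2.21"     # one tidb-server instance

# With host set to the HAProxy address, the request targets port 10080 on
# the HAProxy host. If HAProxy does not listen there, the result is
# "connection refused".
print(schema_url(haproxy_ip, 10080, "prod_dw_sjzt"))

# A direct connection would target the tidb-server status port instead.
print(schema_url(tidb_ip, 10080, "prod_dw_sjzt"))
```

The point of the sketch is that both the SQL connection and the status request reuse the same host value, so whatever sits at that address has to answer on both ports.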

| username: 大飞哥online | Original post link

Connection refused: The error message “connection refused” indicates that the connection was rejected.
Network issue: The error message “dial tcp” indicates a problem occurred while attempting to establish a TCP connection.

| username: 像风一样的男子 | Original post link

Is port 10080 open? Does HAProxy proxy TiDB’s port 10080?

| username: 大飞哥online | Original post link

Solution:

  1. Check connection configuration: Ensure the provided connection address and port are correct. You can use telnet to test if the target address and port are reachable.
  2. Check network connection: Ensure the network connection is stable and there are no firewall or network configuration issues.
  3. Check the status of the target server.

Troubleshoot step by step: first try a direct connection to TiDB, then connect through HAProxy.
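
If telnet is not available, the same reachability check can be sketched in a few lines of Python. The function name port_open is hypothetical, and the addresses in the usage note are placeholders that you would replace with your own.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, calling port_open("<tidb1-ip>", 10080) checks the status port directly, and port_open("<haproxy ip>", 10080) checks the same port through HAProxy. If the first call returns True and the second returns False, the problem is on the HAProxy side rather than in TiDB or the network.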

| username: 大飞哥online | Original post link

When using the PROXY protocol, you need to set [proxy-protocol.networks] in the tidb-server configuration file.

| username: 大飞哥online | Original post link

You can upload your HAProxy configuration file. You can also refer to the documentation: Best Practices for Using HAProxy in TiDB (HAProxy 在 TiDB 中的最佳实践) | PingCAP Docs.

| username: zhanggame1 | Original post link

Lightning probably does not support access through HAProxy.

| username: zhanggame1 | Original post link

From the Lightning architecture diagram, importing requires access to three components: TiDB, PD, and TiKV. It is evident that HAProxy has not proxied any of them.

| username: Fly-bird | Original post link

Network is down.

| username: 在路上123 | Original post link

I also feel that haproxy is not supported :smile:

| username: 有猫万事足 | Original post link

Just start another HAProxy instance and proxy the 10080 port.

Copy the original configuration file and name the copy haproxy10080.cnf.

The content inside is as follows:

global                                     # Global configuration.
   log         127.0.0.1 local2            # Define the global syslog server, up to two can be defined.
   chroot      /var/lib/haproxy            # Change the current directory and set superuser privileges for the startup process to enhance security.
   pidfile     /var/run/haproxy.pid        # Write the PID of the HAProxy process to the pidfile.
   maxconn     4096                        # Maximum concurrent connections a single HAProxy process can accept, equivalent to the command line parameter "-n".
   nbthread    8                           # Maximum number of threads. The upper limit of threads is the same as the number of CPUs.
   user        haproxy                     # Same as the UID parameter.
   group       haproxy                     # Same as the GID parameter, it is recommended to use a dedicated user group.
   daemon                                  # Let HAProxy work in the background as a daemon process, equivalent to the command line parameter "-D". Of course, it can also be disabled with the "-db" parameter on the command line.
   stats socket /var/lib/haproxy/stats     # Path of the UNIX socket that exposes statistics.

defaults                                   # Default configuration.
   log global                              # Logs inherit the settings from the global configuration section.
   retries 2                               # Maximum number of attempts to connect to the upstream server, exceeding this value will consider the backend server unavailable.
   timeout connect 2s                      # Timeout for HAProxy to connect to the backend server. If within the same LAN, it can be set to a shorter time.
   timeout client 30000s                   # Timeout for inactive connections between the client and HAProxy after data transfer is complete.
   timeout server 30000s                   # Timeout for inactive connections on the server side.

listen admin_stats                         # Combination of frontend and backend, the name of this monitoring group can be customized as needed.
   bind 0.0.0.0:8001                       # Listening port. (Change this to separate it from the management port of another instance)
   mode http                               # Mode in which the monitoring runs, here it is `http` mode.
   option httplog                          # Enable logging of HTTP requests.
   maxconn 10                              # Maximum concurrent connections.
   stats refresh 30s                       # Automatically refresh the monitoring page every 30 seconds.
   stats uri /haproxy                      # URL of the monitoring page.
   stats realm HAProxy                     # Prompt information on the monitoring page.
   stats auth admin:pingcap123             # User and password for the monitoring page, multiple usernames can be set.
   stats hide-version                      # Hide the HAProxy version information on the monitoring page.
   stats admin if TRUE                     # Manually enable or disable backend servers (supported from HAProxy 1.4.9 onwards).

listen tidb-cluster                        # Configure database load balancing.
   bind <haproxy ip>:10080                 # Floating IP and listening port. (Change this to bind to port 10080)
   mode tcp                                # HAProxy should use the 4th layer transport layer.
   balance leastconn                       # The server with the least connections receives the connection first. `leastconn` is recommended for long-session services such as LDAP, SQL, TSE, etc., rather than short-session protocols like HTTP. This algorithm is dynamic, and the server weight will be adjusted during operation for slow-starting servers.
   server tidb-1 <tidb1-ip>:10080 check inter 2000 rise 2 fall 3       # Check port 10080, the check frequency is once every 2000 milliseconds. If 2 checks are successful, the server is considered available; if 3 checks fail, the server is considered unavailable.
   server tidb-2 <tidb2-ip>:10080 check inter 2000 rise 2 fall 3

Start it with haproxy -f haproxy10080.cnf, then use curl against port 10080 on the HAProxy host to confirm that the TiDB status port responds. If it does, lightning can use it.
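
As an alternative to curl, here is a small Python sketch of the same check. It assumes that the TiDB status port serves a /status endpoint returning JSON. The function name fetch_tidb_status is hypothetical.

```python
import json
import urllib.request


def fetch_tidb_status(host: str, status_port: int = 10080, timeout: float = 3.0) -> dict:
    """GET http://host:status_port/status and return the decoded JSON body."""
    url = f"http://{host}:{status_port}/status"
    # Ignore any system-wide HTTP proxy so that the request really goes to host.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling fetch_tidb_status("<haproxy ip>") should return a dictionary if the proxy is working. An exception such as URLError with "connection refused" means HAProxy is still not forwarding port 10080.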

| username: 有猫万事足 | Original post link

I tried it myself, opening two ports on one HAProxy to proxy two different ports behind it, and it works. I am using HAProxy version 2.5.

global                                     # Global configuration.
   log         127.0.0.1 local2            # Define the global syslog server, up to two can be defined.
   chroot      /var/lib/haproxy            # Change the current directory and set superuser privileges for the startup process to enhance security.
   pidfile     /var/run/haproxy.pid        # Write the PID of the HAProxy process to the pidfile.
   maxconn     4096                        # Maximum concurrent connections a single HAProxy process can accept, equivalent to the command line parameter "-n".
   nbthread    8                           # Maximum number of threads. The upper limit of threads is the same as the number of CPUs.
   user        haproxy                     # Same as the UID parameter.
   group       haproxy                     # Same as the GID parameter, it is recommended to use a dedicated user group.
   daemon                                  # Let HAProxy work in the background as a daemon process, equivalent to the command line parameter "-D". Of course, you can also disable it with the "-db" parameter in the command line.
   stats socket /var/lib/haproxy/stats     # Path of the UNIX socket that exposes statistics.

defaults                                   # Default configuration.
   log global                              # Logs inherit the settings from the global configuration section.
   retries 2                               # Maximum number of attempts to connect to the upstream server, beyond which the backend server is considered unavailable.
   timeout connect 2s                      # Timeout for HAProxy to connect to the backend server. If within the same LAN, a shorter time can be set.
   timeout client 30000s                   # Timeout for inactive connections between the client and HAProxy after data transfer is complete.
   timeout server 30000s                   # Timeout for inactive connections on the server side.

listen admin_stats                         # Combination of frontend and backend, the name of this monitoring group can be customized as needed.
   bind 0.0.0.0:8000                       # Listening port.
   mode http                               # Mode in which the monitoring runs, here it is `http` mode.
   option httplog                          # Enable logging of HTTP requests.
   maxconn 10                              # Maximum concurrent connections.
   stats refresh 30s                       # Automatically refresh the monitoring page every 30 seconds.
   stats uri /haproxy                      # URL of the monitoring page.
   stats realm HAProxy                     # Prompt information on the monitoring page.
   stats auth admin:pingcap123             # User and password for the monitoring page, multiple usernames can be set.
   stats hide-version                      # Hide the HAProxy version information on the monitoring page.
   stats admin if TRUE                     # Manually enable or disable backend servers (supported from HAProxy 1.4.9 onwards).

listen tidb-cluster                        # Configure database load balancing.
   bind <haproxy ip>:3306                  # Floating IP and listening port.
   mode tcp                                # HAProxy to use the 4th layer transport layer.
   balance leastconn                       # The server with the least connections receives the connection first. `leastconn` is recommended for long session services such as LDAP, SQL, TSE, etc., rather than short session protocols like HTTP. This algorithm is dynamic, and the server weight will be adjusted during operation for slow-starting servers.
   server tidb-1 <tidb1-ip>:4000 check inter 2000 rise 2 fall 3       # Check port 4000, check frequency is once every 2000 milliseconds. If 2 checks are successful, the server is considered available; if 3 checks fail, the server is considered unavailable.
   server tidb-2 <tidb2-ip>:4000 check inter 2000 rise 2 fall 3

listen tidb-cluster-status                 # Configure load balancing for the TiDB status port.
   bind <haproxy ip>:10080                 # Floating IP and listening port.
   mode tcp                                # HAProxy to use the 4th layer transport layer.
   balance leastconn                       # The server with the least connections receives the connection first. `leastconn` is recommended for long session services such as LDAP, SQL, TSE, etc., rather than short session protocols like HTTP. This algorithm is dynamic, and the server weight will be adjusted during operation for slow-starting servers.
   server tidb-1 <tidb1-ip>:10080 check inter 2000 rise 2 fall 3       # Check port 10080, check frequency is once every 2000 milliseconds. If 2 checks are successful, the server is considered available; if 3 checks fail, the server is considered unavailable.
   server tidb-2 <tidb2-ip>:10080 check inter 2000 rise 2 fall 3

This configuration lets HAProxy listen on port 3306 and proxy it to port 4000 of the TiDB instances behind it, while also listening on port 10080 and proxying it to port 10080 of the same TiDB instances. I tested an import with lightning's logical import mode through this setup, and it succeeded. Because I have a br log task enabled, physical import mode is not supported in my environment, so I could not test it and do not know whether it works through HAProxy.
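
For reference, here is a rough Python sketch that prints a matching tidb-lightning.toml fragment. The original poster's configuration is not shown in this thread, so this is only an assumption of what the relevant part could look like. The function name and the IP address are hypothetical, and credentials and all other settings are left out.

```python
def render_lightning_tidb_config(haproxy_ip: str, sql_port: int = 3306,
                                 status_port: int = 10080) -> str:
    """Return a tidb-lightning.toml fragment that points at HAProxy."""
    return (
        "[tikv-importer]\n"
        "# Logical import mode, the only mode tested in this thread.\n"
        'backend = "tidb"\n'
        "\n"
        "[tidb]\n"
        "# Both ports are served by HAProxy and forwarded to tidb-server.\n"
        f'host = "{haproxy_ip}"\n'
        f"port = {sql_port}\n"
        f"status-port = {status_port}\n"
    )


# Hypothetical HAProxy address, for illustration only.
print(render_lightning_tidb_config("192.0.2.10"))
```

The port values mirror the two bind lines in the HAProxy configuration above: 3306 for SQL traffic and 10080 for the status API.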

For methods to reload the configuration with minimal packet loss, you can refer to the following link:
