Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.
Original topic: 单机模拟部署生产集群,TiKV无法连接PD (Simulating a production cluster deployment on a single machine: TiKV cannot connect to PD)
[TiDB Usage Environment] Production Environment / Testing / PoC
Testing
[TiDB Version]
v7.1.0
[Reproduction Path] What operations were performed when the issue occurred
None; this is a single-node deployment of a TiDB cluster.
[Encountered Issue: Issue Phenomenon and Impact]
When I start the cluster with tiup cluster start, the PD node starts successfully, but the logs show that TiKV cannot connect to the PD node.
[Resource Configuration]
[Attachments: Screenshots/Logs/Monitoring]
Based on the provided information, it appears that TiKV is unable to connect to the PD node, possibly due to incorrect configuration of the PD node’s IP address or port number. It is recommended to troubleshoot by following these steps:
- Confirm that the IP address and port number of the PD node are configured correctly. You can check them using the following command:
tiup cluster display <cluster-name>
Here, <cluster-name> is the name of your deployed TiDB cluster.
- Confirm that TiKV is pointed at the correct PD address and port. You can review the cluster configuration using the following command:
tiup cluster edit-config <cluster-name>
This opens the cluster topology in an editor. Check that the host and client_port listed under pd_servers match the address and port that TiKV reports in its log.
- If the above two steps check out, try restarting the TiKV node to see whether it can then connect to the PD node. You can restart a single node using the following command:
tiup cluster restart <cluster-name> -N <tikv-node-id>
Here, <cluster-name> is the name of your deployed TiDB cluster, and <tikv-node-id> is the ID of the TiKV node (in host:port form), as shown in the ID column of:
tiup cluster display <cluster-name>
If the TiKV node still cannot connect to the PD node after restarting, you can check the TiKV node’s log file to confirm the specific error information.
Check the PD logs first to resolve the PD startup issue.
I think you should check the firewall settings.
Additionally, the fact that telnet 127.0.0.1 2379 works does not mean that telnet 192.168.150.100 2379 will work. Try telnet 192.168.150.100 2379 to see whether you can connect; that would be more conclusive.
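For hosts where telnet is not installed, the same comparison can be sketched in shell. This assumes bash and the coreutils timeout command are available; check_port is a hypothetical helper name, and the address and port are simply the ones quoted in this thread.

```shell
# Hypothetical helper: succeed only if a TCP connection to host:port
# can be opened within 3 seconds (uses bash's /dev/tcp redirection).
check_port() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Example usage (left commented out so that sourcing this file has no side effects):
# check_port 127.0.0.1 2379       && echo "loopback: reachable" || echo "loopback: unreachable"
# check_port 192.168.150.100 2379 && echo "LAN: reachable"      || echo "LAN: unreachable"
```

If the loopback check succeeds but the LAN address fails, PD is probably listening only on 127.0.0.1 or something is filtering the traffic; if both fail, PD itself is likely not running.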
Is it possible that the disk space is insufficient?
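A quick way to rule disk space in or out, sketched under the assumption that df and awk are available. disk_use_pct is a hypothetical helper name, and /tidb-data in the example is only an assumed data directory; substitute the data_dir from your own topology.

```shell
# Hypothetical helper: print the usage percentage (0-100) of the
# filesystem that holds the given path.
disk_use_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Example usage (the path is an assumption; use your own data directory):
# echo "data dir usage: $(disk_use_pct /tidb-data)%"
```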
It is very likely that the firewall is not turned off.
Okay, thank you. I just checked, and the firewall is not the cause: the port is open once PD starts. I then looked at the PD logs and found a memory-related error: panic: runtime error: invalid memory address or nil pointer dereference.
Indeed, PD ran for a while and then reported an error.
PD ran for a while, then reported an error and exited, which is why TiKV could not connect.
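To spot this kind of crash quickly, a small sketch that lists panic lines in a log file. find_panics is a hypothetical helper name, and the log path in the example is an assumption based on a default tiup deploy directory; adjust it to your own deploy_dir.

```shell
# Hypothetical helper: print every line containing "panic:" in the given
# log file, prefixed with its line number.
find_panics() {
    grep -n 'panic:' "$1"
}

# Example usage (the path is an assumption, not taken from this thread):
# find_panics /tidb-deploy/pd-2379/log/pd_stderr.log
```

If this prints anything, the lines that follow the panic in the log normally contain the Go stack trace, which is the part worth posting when asking for help.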
Thank you, everyone. It seems like there isn’t enough memory.
Hahaha, from now on, check these first: 1. disk, 2. memory, 3. network, 4. firewall.
Memory issues are worth checking. Use ‘top’ to see whether the memory is fully utilized.
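As a non-interactive alternative to top, here is a minimal sketch that reads the available memory on Linux. It assumes /proc/meminfo exposes a MemAvailable field (true on reasonably recent kernels); mem_available_mb is a hypothetical helper name.

```shell
# Hypothetical helper: print available memory in MiB, read from a
# meminfo-format file (defaults to /proc/meminfo).
mem_available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Example usage:
# echo "available memory: $(mem_available_mb) MiB"
```

Running it before and shortly after tiup cluster start shows how much headroom remains once PD, TiKV, and TiDB are all running on the same machine.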