TiDB Pod Not Created During k8s Deployment

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: k8s 部署时 tidb pod未创建 (TiDB pod not created during k8s deployment)

| username: h5n1

Continuing from the previous post https://asktug.com/t/topic/999565/4:

After applying the cluster configuration file, the PD and TiKV pods are running, but no TiDB pods have been created. How should I go about troubleshooting this?

apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: tidb-test-cluster
  namespace: default
spec:
  timezone: UTC
  configUpdateStrategy: RollingUpdate
  imagePullPolicy: Always
  helper:
    image: alpine:3.16.0
  pvReclaimPolicy: Retain
  discovery: {}
  enableDynamicConfiguration: true
  pd:
    baseImage: 10.172.49.246/zongbu-sre/pd-arm64:v6.1.2
    config: |
      [dashboard]
        internal-proxy = true
    replicas: 3
    maxFailoverCount: 0
    requests:
      cpu: 2000m
      memory: 12Gi
      storage: 100Gi
    limits:
      cpu: 2000m
      memory: 12Gi
      storage: 100Gi
    storageClassName: "pd-ssd-storage"
  tidb:
    baseImage: 10.172.49.246/zongbu-sre/tidb-arm64:v6.1.2
    config: |
      [performance]
        tcp-keep-alive = true
    replicas: 3
    maxFailoverCount: 0
    requests:
      cpu: 2000m
      memory: 16Gi
      storage: 100Gi
    limits:
      cpu: 2000m
      memory: 16Gi
      storage: 100Gi
    service:
      type: "ClusterIP"
    storageClassName: "tidb-storage"
  tikv:
    baseImage: 10.172.49.246/zongbu-sre/tikv-arm64:v6.1.2
    config: |
      log-level = "info"
    replicas: 6
    maxFailoverCount: 0
    requests:
      cpu: 2000m
      memory: 32Gi
      storage: 500Gi
    limits:
      cpu: 2000m
      memory: 32Gi
      storage: 500Gi
    storageClassName: "tikv-ssd-storage" 

I0110 13:49:42.667328       1 tidb_cluster_controller.go:131] TidbCluster: default/tidb-test-cluster, still need sync: TidbCluster: [default/tidb-test-cluster], waiting for TiKV cluster running, requeuing
I0110 13:49:49.794007       1 tikv_member_manager.go:834] TiKV of Cluster default/tidb-test-cluster not bootstrapped yet
I0110 13:49:49.799414       1 tikv_member_manager.go:938] TiKV of Cluster default/tidb-test-cluster is not bootstrapped yet, no need to set store labels
I0110 13:49:49.800033       1 tidb_cluster_controller.go:131] TidbCluster: default/tidb-test-cluster, still need sync: TidbCluster: [default/tidb-test-cluster], waiting for TiKV cluster running, requeuing
I0110 13:50:19.811670       1 tikv_member_manager.go:834] TiKV of Cluster default/tidb-test-cluster not bootstrapped yet
I0110 13:50:19.818649       1 tikv_member_manager.go:938] TiKV of Cluster default/tidb-test-cluster is not bootstrapped yet, no need to set store labels
I0110 13:50:19.819283       1 tidb_cluster_controller.go:131] TidbCluster: default/tidb-test-cluster, still need sync: TidbCluster: [default/tidb-test-cluster], waiting for TiKV cluster running, requeuing
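
The operator log above also hints at why the pods are missing: tidb-operator brings the components up in order (PD, then TiKV, then TiDB) and keeps requeuing the TidbCluster while TiKV is not yet bootstrapped, so the TiDB StatefulSet has simply not been created yet. One way to see which resources the operator has created so far (the label selector below is the standard app.kubernetes.io/instance label that tidb-operator puts on cluster resources) is:

# list pods and statefulsets created for this cluster so far
kubectl get pods,statefulsets -n default -l app.kubernetes.io/instance=tidb-test-cluster -o wide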

| username: xfworld | Original post link

You can check the operator’s logs to find the errors. The official documentation also provides a troubleshooting guide.
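
For example, assuming TiDB Operator was installed with the defaults into the tidb-admin namespace under the deployment name tidb-controller-manager (adjust both to your installation), the controller logs can be tailed with:

# tail the tidb-operator controller logs; namespace and deployment name are assumptions from a default install
kubectl -n tidb-admin logs deploy/tidb-controller-manager --tail=200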

Refer to the following:

| username: tidb菜鸟一只 | Original post link

kubectl describe tidbclusters -n ${namespace} ${cluster_name}
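
With the names used in this thread, that is, for example:

kubectl describe tidbclusters -n default tidb-test-cluster

The Events section and the per-component status usually show why a member is stuck.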

| username: h5n1 | Original post link

It looks like TiKV cannot connect to PD.
— tikv pod log:

[2023/01/11 01:16:20.611 +00:00] [INFO] [util.rs:587] ["connecting to PD endpoint"] [endpoints=http://tidb-test-cluster-pd:2379]
[2023/01/11 01:16:22.611 +00:00] [INFO] [util.rs:549] ["PD failed to respond"] [err="Grpc(RpcFailure(RpcStatus { code: 4-DEADLINE_EXCEEDED, message: \"Deadline Exceeded\", details: [] }))"] [endpoints=http://tidb-test-cluster-pd:2379]
[2023/01/11 01:16:22.611 +00:00] [WARN] [client.rs:163] ["validate PD endpoints failed"] [err="Other(\"[components/pd_client/src/util.rs:582]: PD cluster failed to respond\")"]
[2023/01/11 01:16:22.912 +00:00] [INFO] [util.rs:587] ["connecting to PD endpoint"] [endpoints=http://tidb-test-cluster-pd:2379]

./pd-ctl store

Failed to get store: [500] "[PD:cluster:ErrNotBootstrapped]TiKV cluster not bootstrapped, please start TiKV first"

ps -ef|grep tikv

root           1       0  0 01:13 ?        00:00:00 /tikv-server --pd=http://tidb-test-cluster-pd:2379 --advertise-addr=tidb-test-cluster-tikv-5.tidb-test-cluster-tikv-peer.default.svc:20160 --addr=0.0.0.0:20160 --status-addr=0.0.0.0:20180 --advertise-status-addr=tidb-test-cluster-tikv-5.tidb-test-cluster-tikv-peer.default.svc:20180 --data-dir=/var/lib/tikv --capacity=500GB --config=/etc/tikv/tikv.toml

kubectl get svc -A

NAMESPACE     NAME                          TYPE        CLUSTER-IP        EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes                    ClusterIP   192.168.0.1       <none>        443/TCP                  2d17h
default       tidb-test-cluster-discovery   ClusterIP   192.168.104.90    <none>        10261/TCP,10262/TCP      11h
default       tidb-test-cluster-pd          ClusterIP   192.168.153.231   <none>        2379/TCP                 11h
default       tidb-test-cluster-pd-peer     ClusterIP   None              <none>        2380/TCP,2379/TCP        11h
default       tidb-test-cluster-tikv-peer   ClusterIP   None              <none>        20160/TCP                11h
kube-system   kube-dns                      ClusterIP   192.168.0.222     <none>        53/UDP,53/TCP,9153/TCP   5d13h

kubectl describe svc tidb-test-cluster-pd

Name:              tidb-test-cluster-pd
Namespace:         default
Labels:            app.kubernetes.io/component=pd
                   app.kubernetes.io/instance=tidb-test-cluster
                   app.kubernetes.io/managed-by=tidb-operator
                   app.kubernetes.io/name=tidb-cluster
                   app.kubernetes.io/used-by=end-user
Annotations:       pingcap.com/last-applied-configuration:
                     {"ports":[{"name":"client","protocol":"TCP","port":2379,"targetPort":2379}],"selector":{"app.kubernetes.io/component":"pd","app.kubernetes...
Selector:          app.kubernetes.io/component=pd,app.kubernetes.io/instance=tidb-test-cluster,app.kubernetes.io/managed-by=tidb-operator,app.kubernetes.io/name=tidb-cluster
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                192.168.153.231
IPs:               192.168.153.231
Port:              client  2379/TCP
TargetPort:        2379/TCP
Endpoints:         172.16.112.142:2379,172.16.228.152:2379,172.16.252.53:2379
Session Affinity:  None
Events:            <none>
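
Since the PD service and its endpoints look healthy, a quick way to confirm whether a TiKV pod can actually resolve and reach PD over the cluster network is to exec into one of them (the pod name comes from the ps output above; whether nslookup and curl are present in the image is an assumption, so use wget or an ephemeral debug container otherwise):

# check DNS resolution of the PD service from inside a TiKV pod
kubectl exec -it tidb-test-cluster-tikv-5 -n default -- nslookup tidb-test-cluster-pd.default.svc
# query PD's members API over the same address TiKV was started with
kubectl exec -it tidb-test-cluster-tikv-5 -n default -- curl -s http://tidb-test-cluster-pd:2379/pd/api/v1/members
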
| username: xfworld | Original post link

Refer to the Kubernetes documentation on network configuration and troubleshooting; that part alone is already quite complex… :rofl:

| username: h5n1 | Original post link

Resolved. It took some drastic measures: I cleared iptables, restarted the hosts, and deleted and recreated the previous PVs and the cluster. In the end it was most likely a network issue.
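
For reference, that cleanup amounts to roughly the following sketch (the exact commands are an assumption reconstructed from the description, and flushing iptables or deleting PVs is destructive, so double-check before running anything like this):

# delete the cluster object and its data volumes so everything can be recreated from scratch
kubectl delete tidbcluster tidb-test-cluster -n default
kubectl delete pvc -n default -l app.kubernetes.io/instance=tidb-test-cluster
kubectl delete pv <released-pv-names>      # pvReclaimPolicy: Retain keeps the old PVs around
# on each host, as root: flush stale iptables rules and reboot so kube-proxy/CNI rebuild them
iptables -F && reboot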

| username: h5n1 | Original post link

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.