[FATAL] [server.rs:698] ["failed to start node: Other(\"[components/pd_client/src/util.rs:756]: version should be compatible with version 7.5.0, got 5.0.1\")"]

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: [FATAL] [server.rs:698] [“failed to start node: Other("[components/pd_client/src/util.rs:756]: version should compatible with version 7.5.0, got 5.0.1")”]

| username: Quino

[TiDB Usage Environment] Production Environment
[TiDB Version] Deployed using k8s+kubesphere+operator
[Reproduction Path] Adding a TiKV node (pod)
[Encountered Problem: Phenomenon and Impact] [FATAL] [server.rs:698] [“failed to start node: Other("[components/pd_client/src/util.rs:756]: version should compatible with version 7.5.0, got 5.0.1")”]
[Resource Configuration]
[Attachments: Screenshot/Logs/Monitoring]

| username: 江湖故人 | Original post link

Is it possible that the component versions are not unified?
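
One way to check whether the components agree (a minimal sketch; the namespace name tidb-cluster and kubectl access are assumptions) is to list the image used by every pod in the cluster:

# Show the image of each pod in the cluster namespace (namespace name is an assumption)
kubectl -n tidb-cluster get pods \
  -o custom-columns=NAME:.metadata.name,IMAGE:.spec.containers[*].image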

| username: Quino | Original post link

Hello, where can I configure the TiKV version? I am using the operator (v1.6.0-alpha.8) for deployment, all with default configurations.

| username: Quino | Original post link

[The original post contained only a screenshot, which was not translated.]

| username: 江湖故人 | Original post link

Try replacing all the images with version 7.5.

| username: 江湖故人 | Original post link

| TiDB Version | Applicable TiDB Operator Version |
| --- | --- |
| dev | dev |
| TiDB >= 7.1 | 1.5 (recommended), 1.4 |
| 6.5 <= TiDB < 7.1 | 1.5, 1.4 (recommended), 1.3 |
| 5.4 <= TiDB < 6.5 | 1.4, 1.3 (recommended) |
| 5.1 <= TiDB < 5.4 | 1.4, 1.3 (recommended), 1.2 |
| 3.0 <= TiDB < 5.1 | 1.4, 1.3 (recommended), 1.2, 1.1 |
| 2.1 <= TiDB < 3.0 | 1.0 (no longer maintained) |

| username: Quino | Original post link

Okay👌 Thanks for the reply🙏 I’ll give it a try.

| username: dba远航 | Original post link

Replace version 5 with version 7.

| username: Quino | Original post link

Hello, thank you for your reply :pray:. Can the image version be changed freely in the online environment? Will there be any data loss?

| username: Quino | Original post link

The image versions shown for tikv, tidb, and pd in the YAML configuration file are all v6.5.0, but after the pod starts, for some reason, the TiKV image ends up as 5.0.1.

| username: TiDBer_jYQINSnf | Original post link

Use kubectl describe pod to check if the TiKV image is really 5.0.1.
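
For example (a sketch; the pod name tidb-cluster-tikv-0 and the namespace tidb-cluster are assumptions):

# Check the image and imageID the TiKV pod is actually running (names are placeholders)
kubectl -n tidb-cluster describe pod tidb-cluster-tikv-0 | grep -i image
# Or read the same fields straight from the pod object
kubectl -n tidb-cluster get pod tidb-cluster-tikv-0 \
  -o jsonpath='{.spec.containers[*].image}{"\n"}{.status.containerStatuses[*].imageID}{"\n"}'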

| username: Quino | Original post link

It seems that the specific image version (tag) is not displayed; only the digest value is shown.

It is indeed version 5.0.1, and the sha256 matches.

| username: TiDBer_jYQINSnf | Original post link

In the TidbCluster, do not write the version number in the image path; instead, specify the version separately in spec.version. Similar to the example below (I have removed other content and kept only the image and version fields); this way, all components will use the same version. The example is from: tidb-operator/examples/basic-random-password/tidb-cluster.yaml at master · pingcap/tidb-operator · GitHub

apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
spec:
  version: v7.1.1
  helper:
    image: alpine:3.16.0
  pd:
    baseImage: pingcap/pd
  tikv:
    baseImage: pingcap/tikv
  tidb:
    baseImage: pingcap/tidb
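
Assuming the manifest above is saved as tidb-cluster.yaml, applying it and confirming which image the operator renders might look like this (the basic-tikv StatefulSet name follows the <cluster>-tikv convention):

kubectl -n <namespace> apply -f tidb-cluster.yaml
# The TiKV StatefulSet should now reference pingcap/tikv:v7.1.1, taken from spec.version
kubectl -n <namespace> get sts basic-tikv \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="tikv")].image}{"\n"}'
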
| username: Quino | Original post link

Okay, thank you for the reply! I’ll give it a try.

| username: Quino | Original post link

Hello, boss. Yesterday, I followed this configuration to move the version outside, but it still didn’t take effect. It’s still version 5.0.1 :joy:

| username: TiDBer_jYQINSnf | Original post link

This is a bit strange.

  1. Check the sts: kubectl get sts xxx-tikv -n xxx -oyaml and see whether the TiKV image is the correct version. If it is correct, continue to check the pod.
  2. Check the pod: kubectl get pod xxx-tikv-x -n xxx -oyaml and look at its image. If that is still correct, go check on the node itself.
  3. Log in to the node where that pod is running and run docker image list to see whether the TiKV tag is wrong. If so, delete that image directly:
    docker rmi xxx
  4. Then recreate the pod. (A combined sketch of these commands follows below.)
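
Put together, the checks above might look like this (a sketch with placeholder names, assuming Docker is the container runtime on the node):

# 1. Image recorded in the TiKV StatefulSet (replace xxx with your names/namespace)
kubectl get sts xxx-tikv -n xxx -o yaml | grep 'image:'
# 2. Image recorded in the pod
kubectl get pod xxx-tikv-0 -n xxx -o yaml | grep -i image
# 3. On the node, find the wrongly tagged TiKV image and remove it
docker image list | grep tikv
docker rmi <image-id>
# 4. Recreate the pod so it pulls the image again
kubectl delete pod xxx-tikv-0 -n xxx
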
| username: Quino | Original post link

I checked as you said, and the image content in kubectl get pod is:

image: docker.io/pingcap/tikv:latest
imageID: docker.io/pingcap/tikv@sha256:2b0992519eb2cabdf22291a7066c0ab5cb93373825366c5b6cf97b273eb2cb53

I compared this sha256 value on Docker Hub, and it corresponds to version 5.0.1. However, the other kv pods are like this:

image: docker.io/pingcap/tikv:latest
imageID: docker.io/pingcap/tikv@sha256:d2adb67c75e9d25dda8c8c367c1db269e079dfd2f8427c8aff0ff44ec1c1be09

The TiKV pods on the other nodes run normally without this issue; only two nodes have the problem.

Then I deleted and recreated the problematic TiKV pods, but they come back with the same sha256 and the same version (5.0.1) :joy:
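
One local way to double-check which tag a digest corresponds to (a sketch, assuming the Docker CLI is available on the node) is to list images together with their digests and compare against the imageID above:

docker images --digests | grep pingcap/tikv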

| username: TiDBer_jYQINSnf | Original post link

  1. “latest” is definitely incorrect; the “latest” tag on Docker Hub is not 5.0.1. Could the image registry on those two nodes be mirrored to a third-party address?
  2. Check whether the image version in the sts is correct. If that is also wrong, you need to look at the operator; the running image should not use a tag like “latest”. (See the sketch below for both checks.)
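
Two hedged checks for the points above (cluster/namespace names, file paths, and the container runtime are assumptions):

# Image requested by the TiKV StatefulSet
kubectl -n tidb-cluster get sts tidb-cluster-tikv \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="tikv")].image}{"\n"}'
# On the two problematic nodes, look for a registry mirror that might serve stale tags
cat /etc/docker/daemon.json                                # "registry-mirrors" (Docker)
grep -A 2 'registry.mirrors' /etc/containerd/config.toml   # (containerd)
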
| username: Quino | Original post link

I just checked the status of the sts (screenshots were attached in the original post). The image displayed by the sts is indeed incorrect.

My operator version is v1.6.0-alpha.9, deployed in the application template.


Thanks for the reply!

| username: Quino | Original post link

This is the entire content of my TidbCluster custom resource (CR):
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: >
      {"apiVersion":"pingcap.com/v1alpha1","kind":"TidbCluster","metadata":{"annotations":{"kubesphere.io/creator":"admin","meta.helm.sh/release-name":"tidb-cluster","meta.helm.sh/release-namespace":"tidb-cluster","pingcap.com/ha-topology-key":"kubernetes.io/hostname","pingcap.com/pd.tidb-cluster-pd.sha":"cfa0d77a","pingcap.com/tidb.tidb-cluster-tidb.sha":"866b9771","pingcap.com/tikv.tidb-cluster-tikv.sha":"1c8d5543"},"labels":{"app.kubernetes.io/component":"tidb-cluster","app.kubernetes.io/instance":"tidb-cluster","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"tidb-cluster","app.kubesphere.io/instance":"tidb-cluster","helm.sh/chart":"tidb-cluster-v1.6.0-alpha.8"},"name":"tidb-cluster","namespace":"tidb-cluster"},"spec":{"discovery":{},"enablePVReclaim":false,"helper":{"image":"busybox:1.34.1"},"imagePullPolicy":"IfNotPresent","pd":{"affinity":{},"baseImage":"pingcap/pd","enableDashboardInternalProxy":true,"hostNetwork":false,"image":"pingcap/pd:v6.5.0","imagePullPolicy":"IfNotPresent","maxFailoverCount":3,"replicas":3,"requests":{"storage":"1Gi"},"startTimeout":30,"storageClassName":"nfs-client"},"pvReclaimPolicy":"Retain","schedulerName":"tidb-scheduler","services":[{"name":"pd","type":"ClusterIP"}],"tidb":{"affinity":{},"baseImage":"pingcap/tidb","binlogEnabled":false,"hostNetwork":false,"image":"pingcap/tidb:v6.5.0","imagePullPolicy":"IfNotPresent","maxFailoverCount":3,"replicas":2,"separateSlowLog":true,"slowLogTailer":{"image":"busybox:1.33.0","imagePullPolicy":"IfNotPresent","limits":{"cpu":"100m","memory":"50Mi"},"requests":{"cpu":"20m","memory":"5Mi"}},"tlsClient":{}},"tikv":{"affinity":{},"baseImage":"pingcap/tikv","hostNetwork":false,"image":"pingcap/tikv:v6.5.0","imagePullPolicy":"IfNotPresent","maxFailoverCount":3,"replicas":3,"requests":{"storage":"10Gi"},"scalePolicy":{"scaleInParallelism":1,"scaleOutParallelism":1},"storageClassName":"nfs-client"},"timezone":"UTC","tiproxy":{"baseImage":"pingcap/tiproxy","imagePullPolicy":"IfNotPresent","replicas":0,"requests":{"storage":"1Gi"},"storageClassName":"nfs-client","version":"v6.5.0"},"tlsCluster":{},"version":""}}
    kubesphere.io/creator: admin
    meta.helm.sh/release-name: tidb-cluster
    meta.helm.sh/release-namespace: tidb-cluster
    pingcap.com/ha-topology-key: kubernetes.io/hostname
    pingcap.com/pd.tidb-cluster-pd.sha: cfa0d77a
    pingcap.com/tidb.tidb-cluster-tidb.sha: 866b9771
    pingcap.com/tikv.tidb-cluster-tikv.sha: 1c8d5543
  labels:
    app.kubernetes.io/component: tidb-cluster
    app.kubernetes.io/instance: tidb-cluster
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: tidb-cluster
    app.kubesphere.io/instance: tidb-cluster
    helm.sh/chart: tidb-cluster-v1.6.0-alpha.8
  name: tidb-cluster
  namespace: tidb-cluster
spec:
  discovery: {}
  enablePVReclaim: false
  version: v7.5.0
  helper:
    image: 'busybox:1.34.1'
  imagePullPolicy: IfNotPresent
  pd:
    affinity: {}
    baseImage: pingcap/pd
    enableDashboardInternalProxy: true
    hostNetwork: false
    imagePullPolicy: IfNotPresent
    maxFailoverCount: 3
    replicas: 3
    requests:
      storage: 1Gi
    startTimeout: 30
    storageClassName: nfs-client
  pvReclaimPolicy: Retain
  schedulerName: tidb-scheduler
  services:
    - name: pd
      type: ClusterIP
  tidb:
    affinity: {}
    baseImage: pingcap/tidb
    binlogEnabled: false
    hostNetwork: false
    imagePullPolicy: IfNotPresent
    maxFailoverCount: 3
    replicas: 2
    separateSlowLog: true
    slowLogTailer:
      image: 'busybox:1.33.0'
      imagePullPolicy: IfNotPresent
      limits:
        cpu: 100m
        memory: 50Mi
      requests:
        cpu: 20m
        memory: 5Mi
    tlsClient: {}
  tikv:
    affinity: {}
    baseImage: pingcap/tikv
    image: 'pingcap/tikv'
    hostNetwork: false
    imagePullPolicy: IfNotPresent
    maxFailoverCount: 3
    replicas: 3
    requests:
      storage: 20Gi
    scalePolicy:
      scaleInParallelism: 1
      scaleOutParallelism: 1
    storageClassName: nfs-client
  timezone: Asia/Shanghai
  tiproxy:
    baseImage: pingcap/tiproxy
    imagePullPolicy: IfNotPresent
    replicas: 0
    requests:
      storage: 1Gi
    storageClassName: nfs-client
    version: v6.5.0
  tlsCluster: {}
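
One detail that stands out in this spec (an observation, not a confirmed root cause): unlike pd and tidb, the tikv section still carries image: 'pingcap/tikv' with no tag, a leftover from the original Helm values. Following the earlier advice to keep only baseImage and spec.version, a possible follow-up, sketched here and worth trying in a test cluster first, is to drop that field and let spec.version (v7.5.0) determine the tag:

# Remove the untagged spec.tikv.image leftover; cluster and namespace names are taken from the CR above
kubectl -n tidb-cluster patch tidbcluster tidb-cluster --type=json \
  -p='[{"op":"remove","path":"/spec/tikv/image"}]'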