K8s TiDB Practice - Operations

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: k8s Tidb 实践-运维篇

| username: dba小菜鸡-david

Following the previous article on deploying TiDB on k8s, this article walks through basic O&M tests on the TiDB service running in the k8s cluster.

01 TiDB Component Scaling

Scale down PD to 2 replicas:

kubectl patch -n dba tc dba --type merge --patch '{"spec":{"pd":{"replicas":2}}}'

Scale up PD to 3 replicas:

kubectl patch -n dba tc dba --type merge --patch '{"spec":{"pd":{"replicas":3}}}'
kubectl get po -n dba -o wide

02 Maintain k8s Node

Maintain the db53 node (cordon it so no new Pods are scheduled there):

kubectl cordon db53

Scale up PD and you will find that no new Pod is scheduled on db53; instead, db55 ends up running 2 PD Pods. This is only a simulation; running two identical cluster roles on the same node is not recommended.

If the node to be maintained hosts a TiKV Pod, first migrate the TiKV Region leaders off it.

Add an annotation with the key tidb.pingcap.com/evict-leader to the TiKV Pod:

kubectl -n dba annotate pod dba-tikv-2 tidb.pingcap.com/evict-leader="none"

Execute the following command to check if all Region Leaders have been migrated:

kubectl -n dba get tc dba -ojson | jq ".status.tikv.stores | .[] | select ( .podName == \"dba-tikv-2\" ) | .leaderCount"
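To make it clear what the jq filter above extracts, here is the same selection done in Python on a made-up sample of the `.status.tikv.stores` structure (the store IDs and counts below are invented for illustration):

```python
import json

# Sample of the JSON shape returned by `kubectl get tc -ojson` under
# .status.tikv.stores — a map keyed by store ID. Values are invented.
status = json.loads("""
{
  "status": {
    "tikv": {
      "stores": {
        "1": {"podName": "dba-tikv-0", "id": "1", "leaderCount": 12},
        "4": {"podName": "dba-tikv-1", "id": "4", "leaderCount": 9},
        "5": {"podName": "dba-tikv-2", "id": "5", "leaderCount": 0}
      }
    }
  }
}
""")

# Equivalent of the jq filter: iterate the store entries, select the one
# whose podName matches, and read its leaderCount.
stores = status["status"]["tikv"]["stores"].values()
leader_count = next(s["leaderCount"] for s in stores if s["podName"] == "dba-tikv-2")
print(leader_count)  # 0 means all Region leaders have been evicted
```

When this value reaches 0, the leader eviction has finished and the Pod is safe to take down.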

Rebuild the TiKV Pod

Check TiKV Pod store-id:

kubectl get -n dba tc dba -ojson | jq ".status.tikv.stores | .[] | select ( .podName == \"dba-tikv-2\" ) | .id"


In any PD Pod, use the pd-ctl command to offline the TiKV Pod:

kubectl exec -n dba dba-pd-0 -- /pd-ctl store delete 1


Before offlining the TiKV Pod, ensure that the remaining TiKV Pods in the cluster are not fewer than the TiKV data replicas configured in PD (configuration item: max-replicas, default value 3). If this condition is not met, you need to scale up TiKV first.

If the delete command is rejected for this reason, first scale up a TiKV node:

kubectl patch -n dba tc dba --type merge --patch '{"spec":{"tikv":{"replicas":4}}}'

The newly added TiKV Pod remains in the Pending state.

Check the error:

kubectl describe pod dba-tikv-3 -n dba

The events show that one node is in maintenance (cordoned) and the remaining nodes lack sufficient CPU. At this point we can lower the resources requested by TiKV to fit this simulated k8s scenario.

Uncordon the maintenance node:

kubectl uncordon db53.clouddb.bjzdt.qihoo.net

Modify the configuration file:
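The original screenshot of the change is missing; the edit lowers the TiKV resource requests in the TidbCluster spec. A sketch of the relevant fragment (the numeric values here are illustrative, not the ones from the original post):

```yaml
# david/tidb-cluster.yaml — lower what TiKV requests so the Pending
# Pod can be scheduled; the numbers below are illustrative only.
spec:
  tikv:
    requests:
      cpu: "2"
      memory: 4Gi
    limits:
      cpu: "4"
      memory: 8Gi
```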

Apply the configuration:

kubectl replace -f david/tidb-cluster.yaml

You can see that the TiKV pod has restarted.

Then repeat the node-maintenance steps above and scale TiKV up to 4 replicas; this time there is no problem and the scale-out succeeds.

Execute the TiKV Pod offline command again:

kubectl exec -n dba dba-pd-0 -- /pd-ctl store delete 1


Unbind the TiKV Pod from the current storage.

Query the PersistentVolumeClaim used by the Pod:

kubectl get pvc -n dba

Delete the PersistentVolumeClaim:

The NAME column in the output of kubectl get pvc is the PVC name.

kubectl delete -n dba pvc tikv-dba-tikv-2 --wait=false

Delete the TiKV Pod and wait for the newly created TiKV Pod to join the cluster.

kubectl delete -n dba pod dba-tikv-2

pod "dba-tikv-2" deleted

Wait for the newly created TiKV Pod's status to become Up.

From the output, you can see that the new TiKV Pod has a new store-id, and the Region Leader will automatically schedule to this TiKV Pod.

Remove the evict-leader-scheduler that is no longer needed:

kubectl exec -n dba dba-pd-0 -- /pd-ctl scheduler remove evict-leader-scheduler-1


Check the current TiKV nodes, and you will see that db53 is no longer there. The db53 node maintenance is successful.

03 Deploy TiDB Monitor

Edit the configuration file:

apiVersion: pingcap.com/v1alpha1
kind: TidbMonitor
metadata:
  name: dba
  namespace: dba
spec:
  clusters:
  - name: dba
  prometheus:
    baseImage: prom/prometheus
    version: v2.18.1
    #limits:
    #  cpu: 8000m
    #  memory: 8Gi
    #requests:
    #  cpu: 4000m
    #  memory: 4Gi
    imagePullPolicy: IfNotPresent
    logLevel: info
    reserveDays: 12
    service:
      type: NodePort
      portName: http-prometheus
  grafana:
    baseImage: grafana/grafana
    version: 6.0.1
    imagePullPolicy: IfNotPresent
    logLevel: info
    #limits:
    #  cpu: 8000m
    #  memory: 8Gi
    #requests:
    #  cpu: 4000m
    #  memory: 4Gi
    username: admin
    password: admin
    envs:
      # Configure Grafana using environment variables except GF_PATHS_DATA, GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD
      # Ref https://grafana.com/docs/installation/configuration/#using-environment-variables
      # if grafana is running behind a reverse proxy with subpath http://foo.bar/grafana
      # GF_SERVER_DOMAIN: foo.bar
      # GF_SERVER_ROOT_URL: "%(protocol)s://%(domain)s/grafana/"
    service:
      type: NodePort
      portName: http-grafana
  initializer:
    baseImage: pingcap/tidb-monitor-initializer
    version: v6.1.0
    imagePullPolicy: Always
    #limits:
    #  cpu: 50m
    #  memory: 64Mi
    #requests:
    #  cpu: 50m
    #  memory: 64Mi
  reloader:
    baseImage: pingcap/tidb-monitor-reloader
    version: v1.0.1
    imagePullPolicy: IfNotPresent
    service:
      type: NodePort
      portName: tcp-reloader
    #limits:
    #  cpu: 50m
    #  memory: 64Mi
    #requests:
    #  cpu: 50m
    #  memory: 64Mi
  imagePullPolicy: IfNotPresent
  persistent: true
  storageClassName: shared-ssd-storage
  storage: 10Gi
  nodeSelector: {}
  annotations: {}
  tolerations: []
  kubePrometheusURL: http://prometheus-k8s.monitoring.svc:9090
  alertmanagerURL: ""

You can confirm the PVC status with the following command:

kubectl get pvc -l app.kubernetes.io/instance=dba,app.kubernetes.io/component=monitor -n dba

Apply the configuration file:

kubectl apply -f tidb_monitor.yaml

Access the Grafana monitoring dashboard

For direct access to the monitoring data, you can use kubectl port-forward:

kubectl port-forward --address 0.0.0.0 -n dba svc/dba-grafana 3000:3000 &>/tmp/portforward-grafana.log &

Access the Grafana monitoring dashboard on port 3000.

kubectl port-forward --address 0.0.0.0 -n dba svc/dba-prometheus 9090:9090 &>/tmp/portforward-prometheus.log &

Access the Prometheus monitoring data on port 9090.

04 Alertmanager Alert Configuration

If there is an available alertmanager service in the existing infrastructure, you can configure alerts as follows:

Modify the tidb_monitor.yaml configuration file
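The screenshot of this change is missing; the relevant field is alertmanagerURL at the bottom of the TidbMonitor spec. A sketch of the change (the service address below is a placeholder, not the one from the original post):

```yaml
# tidb_monitor.yaml — point Prometheus at an existing alertmanager.
# Replace the placeholder with your real alertmanager service address.
  alertmanagerURL: "http://alertmanager.monitoring.svc:9093"
```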

Reapply the configuration

kubectl replace -f tidb_monitor.yaml

If you want to deploy a separate set of services, refer to the official prometheus/alertmanager project on GitHub to deploy the alertmanager component.

docker run --name alertmanager -d -p 9093:9093 quay.io/prometheus/alertmanager

Typically you will want customized alerts, such as sending to email or an internal messaging service. The following steps log into the docker container to modify the configuration file.

docker ps | grep alert            # find the alertmanager container ID
docker exec -it 25ed6524c91a sh   # log into the container

Customize the alert to send to email or webhook as needed.

cat >> /etc/alertmanager/alertmanager.yml << EOF
global:
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@example.org'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'password'

route:
  receiver: "webhook-sms"
  group_by: ['env','instance','alertname','type','group','job']
  group_wait: 30s
  group_interval: 3m
  repeat_interval: 3h

receivers:
- name: 'webhook-sms'
  webhook_configs:
  - url: 'http://api.xxxxx/public/alertmanagerSendXSM'
EOF

Restart the container so the new configuration takes effect.

Stop the container:

docker stop cdfe1e7beef4

Start the container:

docker start cdfe1e7beef4

Check the service.

List the containers of a specific Pod:

kubectl get pods dba-tidb-1 -o jsonpath='{.spec.containers[*].name}'

Enter a specific container:

kubectl exec -it dba-tidb-0 -c tidb -n dba -- /bin/sh

-c specifies which container to enter.

05 Modify Node Configuration Separately

Example for TiKV:

After the TiKV Pod enters diagnostic mode, you can manually modify the TiKV configuration file and then start the TiKV process with the modified configuration file.

The specific steps are as follows:

Get the TiKV startup command from the TiKV logs, which will be used in subsequent steps.

kubectl logs pod/dba-tikv-0 -n dba -c tikv | head -2 | tail -1

The output will be similar to the following, which is the TiKV startup command.

| username: wuxiangdong | Original post link

I want to use it, but I’m afraid I won’t be able to solve any issues that come up. How does the performance of running on k8s compare to the regular setup?

| username: dba小菜鸡-david | Original post link

I am also testing, and there is definitely a performance loss of about 10-20%. Mainly, it feels hard to manage and requires cooperation from the k8s team. Also, we need to see if it’s a strong requirement. It depends on whether we are considering cost savings or so-called automation. I feel that TiDB’s own region-based raft with 3 replicas can handle failover, so it’s not a strong requirement, but it can be experimented with. Currently, we are preparing the test environment, and once we get it running smoothly, we will move it to production.

| username: dba小菜鸡-david | Original post link

There are also too few practical operation and maintenance articles to refer to, and without some basic knowledge of k8s, the cost of making mistakes is too high.