K8s tidb-controller-manager continuously reporting errors

Note:
This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: k8s tidb-controller-mananger 持续报错

| username: h5n1

tidb-controller-manager keeps logging the following errors, but the cluster is running normally:

I0112 15:52:55.016078       1 event.go:282] Event(v1.ObjectReference{Kind:"TidbCluster", Namespace:"default", Name:"tidb-test-cluster", UID:"5b91a246-829d-42fd-9c9f-fde6a97ed87e", APIVersion:"pingcap.com/v1alpha1", ResourceVersion:"1534884", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' create StatefulSet tidb-test-cluster-pd in  tidb-test-cluster successful
E0112 15:52:55.966398       1 tidbcluster_control.go:90] failed to update TidbCluster: [default/tidb-test-cluster], error: Operation cannot be fulfilled on tidbclusters.pingcap.com "tidb-test-cluster": StorageError: invalid object, Code: 4, Key: /registry/pingcap.com/tidbclusters/default/tidb-test-cluster, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 5b91a246-829d-42fd-9c9f-fde6a97ed87e, UID in object meta: ae8c4cf6-52c7-42c6-bea6-3dcb9412c050
I0112 15:52:55.966427       1 tidb_cluster_controller.go:131] TidbCluster: default/tidb-test-cluster, still need sync: [TidbCluster: [default/tidb-test-cluster], waiting for PD cluster running, Operation cannot be fulfilled on tidbclusters.pingcap.com "tidb-test-cluster": StorageError: invalid object, Code: 4, Key: /registry/pingcap.com/tidbclusters/default/tidb-test-cluster, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 5b91a246-829d-42fd-9c9f-fde6a97ed87e, UID in object meta: ae8c4cf6-52c7-42c6-bea6-3dcb9412c050], requeuing
I0112 15:52:55.970437       1 event.go:282] Event(v1.ObjectReference{Kind:"TidbCluster", Namespace:"default", Name:"tidb-test-cluster", UID:"5b91a246-829d-42fd-9c9f-fde6a97ed87e", APIVersion:"pingcap.com/v1alpha1", ResourceVersion:"1534884", FieldPath:""}): type: 'Normal' reason: 'Successfully Create' create Role/tidb-test-cluster-discovery for controller TidbCluster/tidb-test-cluster successfully
I0112 15:52:55.973719       1 event.go:282] Event(v1.ObjectReference{Kind:"TidbCluster", Namespace:"default", Name:"tidb-test-cluster", UID:"5b91a246-829d-42fd-9c9f-fde6a97ed87e", APIVersion:"pingcap.com/v1alpha1", ResourceVersion:"1534884", FieldPath:""}): type: 'Normal' reason: 'Successfully Create' create ServiceAccount/tidb-test-cluster-discovery for controller TidbCluster/tidb-test-cluster successfully
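The key detail in the E-level line is that the two UIDs differ: the UID sent as the update precondition is not the UID of the object currently stored under that key. One common reading is that the TidbCluster object was deleted and recreated under the same name while the controller still holds the old copy, but nothing in this thread confirms that is what happened here. Below is a minimal Python sketch, assuming the log format shown above, that pulls the two UIDs out of a log line. The function name `find_uid_mismatch` is made up for illustration.

```python
import re

# Matches the apiserver storage error text seen in the log above.
UID_MISMATCH = re.compile(
    r"UID in precondition: (?P<cached>[0-9a-f-]{36}), "
    r"UID in object meta: (?P<stored>[0-9a-f-]{36})"
)


def find_uid_mismatch(line):
    """Return (precondition_uid, stored_uid) if the line reports a
    UID precondition failure, otherwise None."""
    m = UID_MISMATCH.search(line)
    if m is None:
        return None
    return m.group("cached"), m.group("stored")


# Tail of the E0112 line from the log above.
sample = (
    "AdditionalErrorMsg: Precondition failed: "
    "UID in precondition: 5b91a246-829d-42fd-9c9f-fde6a97ed87e, "
    "UID in object meta: ae8c4cf6-52c7-42c6-bea6-3dcb9412c050"
)

result = find_uid_mismatch(sample)
if result is not None:
    precondition_uid, stored_uid = result
    print("controller sent:", precondition_uid)
    print("apiserver holds:", stored_uid)
```

Running this over every line of the manager log would show whether the same pair of UIDs repeats on each sync, which would be consistent with a stale object on the controller side.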

| username: tidb菜鸟一只 | Original post link

Does your TiKV node keep restarting? Is anything actually going wrong?

| username: h5n1 | Original post link

TiKV was restarted earlier. Now, restarting TiKV by adding the annotation as described in the official documentation has no effect. The controller manager shows the error above, and I am not sure whether the two are related.
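For readers unfamiliar with the annotation-based restart mentioned here, the following Python sketch builds the kind of merge patch involved. It assumes the annotation key `tidb.pingcap.com/restartedAt` under `spec.<component>.annotations`, as given in the TiDB Operator restart documentation; check this against the docs for your operator version. The helper name `restart_patch` is hypothetical.

```python
import json
from datetime import datetime, timezone


def restart_patch(component, when=None):
    """Build a JSON merge patch that sets the restart annotation on one
    component (for example "tikv") of a TidbCluster spec.

    The annotation key is an assumption taken from the TiDB Operator docs.
    """
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y-%m-%dT%H:%M")
    return {
        "spec": {
            component: {
                "annotations": {"tidb.pingcap.com/restartedAt": stamp}
            }
        }
    }


# Example with a fixed, arbitrary timestamp.
patch = restart_patch("tikv", datetime(2020, 4, 20, 12, 0, tzinfo=timezone.utc))
print(json.dumps(patch))
```

The printed JSON could then be passed to something like kubectl patch tc tidb-test-cluster -n default --type merge -p '<json>'. The restart is carried out by the operator, so if tidb-controller-manager cannot sync the TidbCluster, the annotation may well go unprocessed.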

| username: TiDBer_jYQINSnf | Original post link

The PD status has been abnormal the whole time, so the sync keeps getting requeued.
Run kubectl get tc -n xxx -o yaml and check the status section. What does the PD part show?
Also run kubectl get tc -n xxx

The ready status should be false, right?
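The check suggested above can be sketched in Python against the JSON form of the same object (kubectl get tc <name> -n <namespace> -o json). The field names used here (`status.conditions` with a `Ready` entry, `status.pd.phase`, and `status.pd.members.<name>.health`) are assumptions about the TidbCluster v1alpha1 status layout, and the sample values are invented for illustration rather than taken from this cluster.

```python
def summarize_tc(tc):
    """Summarize readiness from a TidbCluster object parsed from JSON.

    Field names are assumptions about the v1alpha1 status layout.
    """
    status = tc.get("status", {})

    # Look for the Ready condition; None means it was not reported at all.
    ready = None
    for cond in status.get("conditions", []):
        if cond.get("type") == "Ready":
            ready = cond.get("status") == "True"

    pd = status.get("pd", {})
    members = pd.get("members", {})
    unhealthy = sorted(
        name for name, member in members.items()
        if not member.get("health", False)
    )
    return {
        "ready": ready,
        "pd_phase": pd.get("phase"),
        "unhealthy_pd_members": unhealthy,
    }


# Hypothetical status, shaped like the assumed layout.
sample_tc = {
    "status": {
        "conditions": [{"type": "Ready", "status": "False"}],
        "pd": {
            "phase": "Normal",
            "members": {
                "tidb-test-cluster-pd-0": {"health": True},
                "tidb-test-cluster-pd-1": {"health": False},
            },
        },
    }
}

print(summarize_tc(sample_tc))
```

To use it on a real cluster, load the kubectl output with json.load and pass the resulting dict to summarize_tc.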

| username: ffeenn | Original post link

What version of K8S are you using?

| username: yiduoyunQ | Original post link

Please provide the operator version, the complete logs, and the output of kubectl get tc -o yaml.

| username: ffeenn | Original post link

Take a look at these issues:
“Precondition failed: UID in precondition” · Issue #82130 · kubernetes/kubernetes (github.com)

CM continues requeuing inexisting items · Issue #4437 · cert-manager/cert-manager (github.com)

| username: h5n1 | Original post link

[Version]
OS: 4.19.90-17.ky10.aarch64, k8s: v1.24.9, operator: 1.4.0, tidb 6.1.2
dyrnq/local-volume-provisioner:v2.5.0
The cluster seems fine and is continuously running read/write tests. The tc configuration is attached.
tc.yaml (9.4 KB)


| username: yiduoyunQ | Original post link

Please provide the complete operator logs for review.

| username: h5n1 | Original post link

manager.log (262.9 KB)

| username: yiduoyunQ | Original post link

You need to first resolve the issues in the k8s environment.

| username: h5n1 | Original post link

Judging from the logs, does tidb-controller-manager talk directly to the API server to fetch data and issue commands when it performs these operations?

| username: TiDBer_jYQINSnf | Original post link

Yes, the operator accesses kube-apiserver directly, for example:
Failed to update lock: Put “https://192.168.0.1:443/api/v1/namespaces/tidb-admin/endpoints/tidb-controller-manager”: context deadline exceeded
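The message above is a PUT to the tidb-controller-manager Endpoints object that timed out, which points at the API server side rather than at TiDB itself. To get a quick picture of a large manager.log, a rough Python sketch like the one below can count E-level lines per source location and count timeouts. It assumes the klog header layout visible in the first post (severity letter, MMDD, time, thread id, file:line, then the message). The sample lines are shortened from this thread, and `summarize_log` is a made-up name.

```python
import re
from collections import Counter

# klog header: severity, MMDD, time, thread id, file:line, "] ", message.
KLOG = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<tid>\d+) "
    r"(?P<src>[^\s\]]+:\d+)\] (?P<msg>.*)$"
)


def summarize_log(lines):
    """Return (errors_by_source, timeout_count) for an iterable of log lines."""
    errors_by_source = Counter()
    timeouts = 0
    for line in lines:
        # Timeouts are counted even on lines without a klog header.
        if "context deadline exceeded" in line:
            timeouts += 1
        m = KLOG.match(line)
        if m and m.group("sev") == "E":
            errors_by_source[m.group("src")] += 1
    return errors_by_source, timeouts


# Lines shortened from the posts in this thread.
sample_lines = [
    "E0112 15:52:55.966398       1 tidbcluster_control.go:90] failed to update TidbCluster: [default/tidb-test-cluster]",
    "I0112 15:52:55.966427       1 tidb_cluster_controller.go:131] TidbCluster: default/tidb-test-cluster, still need sync",
    'Failed to update lock: Put "https://192.168.0.1:443/api/v1/namespaces/tidb-admin/endpoints/tidb-controller-manager": context deadline exceeded',
]

errors, timeouts = summarize_log(sample_lines)
print(errors.most_common(), timeouts)
```

Many timeouts against the API server address would support looking at the control plane first.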

| username: yiduoyunQ | Original post link

For how this works, refer to TiDB Operator Architecture | PingCAP Docs and TiDB Operator RBAC Rules | PingCAP Docs

| username: h5n1 | Original post link

The root cause is still unclear, but everything returned to normal after restarting several components on the k8s master.
