BR Backup Error

TiDB version: v7.1.1

Reproduction method: Deploy a BR log backup job, then delete the backup without stopping the log backup task first.

Problem: BR:Stream:ErrStreamLogTaskExist in a Kubernetes cluster

Attachment:

This error occurred in a Kubernetes cluster:
Error: It supports single stream log task currently: [BR:Stream:ErrStreamLogTaskExist]stream task already exists
, err: exit status 1

This error message indicates that there is already a stream log task running, and BR only supports a single stream log task at a time. Therefore, you need to stop the existing stream log task before starting a new one.

You can use the following command to check the status of the stream log task:

br log status --pd "pd_address" --storage "backup_storage"

If there is a running stream log task, you can stop it using the following command, replacing task_name with the name of that task:

br log stop --task-name "task_name" --pd "pd_address" --storage "backup_storage"

After stopping the existing stream log task, you can start a new one. If you encounter any other issues, please provide more details and logs for further analysis.
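The check-then-stop sequence above could be wrapped in a small shell function. This is only a sketch: the function name `stop_log_task` and its two arguments are hypothetical, and it assumes that `br` is on the PATH and that the task is inspected with `br log status` and stopped with `br log stop --task-name`.

```shell
# Sketch: show the log backup task registered in PD, then stop it.
# Usage: stop_log_task <pd_address> <task_name>
stop_log_task() {
  local pd="$1" task="$2"
  if [ -z "$pd" ] || [ -z "$task" ]; then
    echo "usage: stop_log_task <pd_address> <task_name>" >&2
    return 2
  fi
  # Print the current state of the task first.
  br log status --task-name "$task" --pd "$pd" || return 1
  # Stop the task so that a new one can be started afterwards.
  br log stop --task-name "$task" --pd "$pd"
}
```

For example, `stop_log_task "pd_address" "task_name"` runs both commands in order and skips the stop if the status check fails.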

This doesn't work because we run TiDB in a Kubernetes cluster.

I apologize for the confusion. Since you are running TiDB in a Kubernetes cluster, you can use the following command to check the status of the stream log task:

kubectl exec -it <BR_POD_NAME> -- /usr/local/bin/br log status --pd "pd_address" --storage "backup_storage"

If there is a running stream log task, you can stop it using the following command:

kubectl exec -it <BR_POD_NAME> -- /usr/local/bin/br log stop --task-name "task_name" --pd "pd_address" --storage "backup_storage"

Please replace <BR_POD_NAME> with the name of the BR pod in your Kubernetes cluster and task_name with the name of the log backup task. If you encounter any other issues, please provide more details and logs for further analysis.

How can I find out which pod BR is running in? It doesn't seem that BR is running in a pod.

If you are not sure which pod is running BR, you can use the following command to list all the pods in your Kubernetes cluster:

kubectl get pods

Then you can check the logs of each pod to find the one that is running BR. You can use the following command to check the logs of a pod:

kubectl logs <pod_name>

Please replace <pod_name> with the name of the pod you want to check. Once you find the pod that is running BR, you can use the previous commands I provided to check and stop the stream log task.
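Instead of reading the logs of every pod, the list can be narrowed by namespace and name. The sketch below assumes that the backup pods live in a known namespace (the status message later in this thread mentions `backup-test`) and that their names contain a recognizable keyword; the function name `find_backup_pods` and the default keyword `backup` are hypothetical.

```shell
# Sketch: list pods in a namespace whose names contain a keyword.
# Usage: find_backup_pods <namespace> [keyword]
find_backup_pods() {
  local ns="$1" keyword="${2:-backup}"
  # --no-headers drops the column header so that every line is a pod.
  kubectl get pods -n "$ns" --no-headers | grep -- "$keyword"
}
```

For example, `find_backup_pods backup-test` prints only the matching pods together with their status columns, which also shows whether a pod is in an error state.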

It seems the pod is already in an error state, so we can't exec into it.

I see. If the pod is in an error state and you cannot execute commands in it, you can try deleting the pod. You can use the following command to delete the pod:

kubectl delete pod <pod_name>

Please replace <pod_name> with the name of the pod you want to delete. If the pod is managed by a controller such as a Job, a replacement pod is created automatically once it is deleted; a standalone pod is not recreated. You can then use the previous commands I provided to check and stop the stream log task.

If the issue persists after deleting and recreating the pod, please provide more details and logs for further analysis.
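When no healthy BR pod is available to exec into, one alternative is to run `br log stop` from a short-lived pod. The following is a sketch under several assumptions: the function name, the pod name `br-log-stop`, the image tag `pingcap/br:v7.1.1`, and the `/br` binary path are not taken from this thread and may differ in your environment, and a cluster with TLS enabled would need additional certificate flags.

```shell
# Sketch: run `br log stop` from a temporary pod instead of the failed one.
# Usage: stop_log_task_from_temp_pod <namespace> <pd_address> <task_name>
stop_log_task_from_temp_pod() {
  local ns="$1" pd="$2" task="$3"
  # --rm deletes the pod when the command exits; --restart=Never runs it once.
  # The image tag and the /br binary path are assumptions; adjust as needed.
  kubectl run br-log-stop -n "$ns" --rm -i --restart=Never \
    --image=pingcap/br:v7.1.1 --command -- \
    /br log stop --task-name "$task" --pd "$pd"
}
```

The PD address must be reachable from inside the cluster, so it would typically be the address of the PD service rather than a local port-forward.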

Conditions:
Last Transition Time: 2023-08-15T09:54:45Z
Status: True
Type: Scheduled
Last Transition Time: 2023-08-15T09:54:47Z
Status: True
Type: Prepare
Last Transition Time: 2023-08-15T09:54:47Z
Status: True
Type: Running
Last Transition Time: 2023-08-15T09:54:51Z
Message: cluster backup-test/demo-log-backup, wait pipe message failed, errMsg [2023/08/15 09:54:48.638 +00:00] [ERROR] [advancer.go:275] [“listen task meet error, would reopen.”] [error=EOF] [stack=“github.com/pingcap/tidb/br/pkg/streamhelper.(*CheckpointAdvancer).StartTaskListener.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/streamhelper/advancer.go:275”]
[2023/08/15 09:54:51.175 +00:00] [ERROR] [stream.go:530] [“failed to stream”] [command=“log start”] [error=“It supports single stream log task currently: [BR:Stream:ErrStreamLogTaskExist]stream task already exists”] [errorVerbose=“[BR:Stream:ErrStreamLogTaskExist]stream task already exists\nIt supports single stream log task currently\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamStart\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:576\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:529\nmain.streamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:232\nmain.newStreamStartCommand.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:70\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:58\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598”] 
[stack=“github.com/pingcap/tidb/br/pkg/task.RunStreamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:530\nmain.streamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:232\nmain.newStreamStartCommand.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:70\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:58\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2023/08/15 09:54:51.175 +00:00] [ERROR] [main.go:60] [“br failed”] [error=“It supports single stream log task currently: [BR:Stream:ErrStreamLogTaskExist]stream task already exists”] [errorVerbose=“[BR:Stream:ErrStreamLogTaskExist]stream task already exists\nIt supports single stream log task currently\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamStart\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:576\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:529\nmain.streamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:232\nmain.newStreamStartCommand.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:70\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:58\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:60\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
Error: It supports single stream log task currently: [BR:Stream:ErrStreamLogTaskExist]stream task already exists
, err: exit status 1
Reason: StartLogBackuFailed
Status: True
Type: Failed
Log Sub Command Statuses:
Log - Start:
Command: log-start
Conditions:
Last Transition Time: 2023-08-15T09:54:45Z
Status: True
Type: Scheduled
Last Transition Time: 2023-08-15T09:54:47Z
Status: True
Type: Prepare
Last Transition Time: 2023-08-15T09:54:47Z
Status: True
Type: Running
Last Transition Time: 2023-08-15T09:54:51Z
Message: cluster backup-test/demo-log-backup, wait pipe message failed, errMsg [2023/08/15 09:54:48.638 +00:00] [ERROR] [advancer.go:275] [“listen task meet error, would reopen.”] [error=EOF] [stack=“github.com/pingcap/tidb/br/pkg/streamhelper.(*CheckpointAdvancer).StartTaskListener.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/streamhelper/advancer.go:275”]
[2023/08/15 09:54:51.175 +00:00] [ERROR] [stream.go:530] [“failed to stream”] [command=“log start”] [error=“It supports single stream log task currently: [BR:Stream:ErrStreamLogTaskExist]stream task already exists”] [errorVerbose=“[BR:Stream:ErrStreamLogTaskExist]stream task already exists\nIt supports single stream log task currently\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamStart\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:576\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:529\nmain.streamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:232\nmain.newStreamStartCommand.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:70\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:58\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598”] 
[stack=“github.com/pingcap/tidb/br/pkg/task.RunStreamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:530\nmain.streamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:232\nmain.newStreamStartCommand.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:70\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:58\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
[2023/08/15 09:54:51.175 +00:00] [ERROR] [main.go:60] [“br failed”] [error=“It supports single stream log task currently: [BR:Stream:ErrStreamLogTaskExist]stream task already exists”] [errorVerbose=“[BR:Stream:ErrStreamLogTaskExist]stream task already exists\nIt supports single stream log task currently\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamStart\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:576\ngithub.com/pingcap/tidb/br/pkg/task.RunStreamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/task/stream.go:529\nmain.streamCommand\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:232\nmain.newStreamStartCommand.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/stream.go:70\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:58\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598”] [stack=“main.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/cmd/br/main.go:60\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250”]
Error: It supports single stream log task currently: [BR:Stream:ErrStreamLogTaskExist]stream task already exists
, err: exit status 1

According to the error message, it seems that there is already a stream log task running, and BR only supports a single stream log task at a time. Therefore, you need to stop the current stream log task before starting a new one.

You can use the following command to stop the current stream log task:

br log stop --task-name "task_name" --pd "pd_address"

After stopping the current stream log task, you can start a new one using the following command:

br log start --task-name "task_name" --pd "pd_address" --storage "s3://bucket_name/path/to/storage"

Please replace "task_name" with the name of the log backup task, "pd_address" with the address of your PD server, and "s3://bucket_name/path/to/storage" with the path to your S3 storage.

If you still encounter issues, please provide more details and logs for further analysis.

How can I run br log stop when BR is running in a Kubernetes cluster?

If you are running BR in a Kubernetes cluster, you can use the following command to stop the log backup task:

kubectl exec -it <pod_name> -- br log stop --pd "pd_address" --storage "s3://bucket_name/path/to/storage" --task-name "task_name"

Please replace <pod_name> with the name of the pod running BR, pd_address with the address of your PD server, s3://bucket_name/path/to/storage with the path to your S3 storage, and task_name with the name of the log backup task you want to stop.

If you don’t know the name of the pod running BR, you can use the following command to list all the pods in your Kubernetes cluster:

kubectl get pods

Once you have the name of the pod running BR, you can use the first command to stop the log backup task.

Please note that you need to have the necessary permissions to execute commands in the pod. If you don’t have the necessary permissions, you may need to contact your Kubernetes administrator for assistance.
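The status output earlier in this thread ("Log Sub Command Statuses", "Command: log-start") suggests that the task was created through a TiDB Operator Backup resource. If that is the case, the task can probably be stopped by changing the resource's `spec.logSubcommand` field to `log-stop` rather than by calling `br` directly. This is a sketch under that assumption; the function name is hypothetical, and the field name should be checked against the TiDB Operator version in use.

```shell
# Sketch: ask TiDB Operator to stop a log backup by updating the Backup resource.
# Usage: stop_log_backup_cr <namespace> <backup_name>
stop_log_backup_cr() {
  local ns="$1" name="$2"
  # A merge patch changes only the logSubcommand field of the spec.
  kubectl patch backup "$name" -n "$ns" --type merge \
    -p '{"spec":{"logSubcommand":"log-stop"}}'
}
```

With the names from the status message, the call would be `stop_log_backup_cr backup-test demo-log-backup`.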

Still no solution found.
I have already reported this issue on GitHub and asked for help on Stack Overflow:

I see that you have already reported this issue on GitHub and asked for help on Stack Overflow. It seems that the issue is related to the fact that BR only supports a single stream log task at a time, and you are trying to start a new log backup task while the previous one is still running.

One possible solution is to stop the current stream log task before starting a new one. However, you mentioned that you are unable to stop the log backup task using the br log stop command.

Another option is to inspect the current stream log task before starting a new one. Note that a log backup task runs continuously and does not finish on its own, so waiting for it to complete will not help. You can check the status of the current stream log task using the following command:

br log status --pd "pd_address" --storage "s3://bucket_name/path/to/storage" --task-name "task_name"

Please replace pd_address with the address of your PD server, s3://bucket_name/path/to/storage with the path to your S3 storage, and task_name with the name of the log backup task you want to check.

If the task is reported as running normally, it has to be stopped before the br log start command can succeed. If it is reported as paused or in an error state, you may need to investigate the cause before stopping it and starting a new stream log task.

If neither of these solutions works, you may need to provide more details and logs for further analysis. You can also try reaching out to the TiDB community for help.