BR Backup Failed: failed to create new OS thread (have 50 already; errno=11)

This topic has been translated from a Chinese forum by GPT and might contain errors.

Original topic: BR备份失败 failed to create new OS thread (have 50 already; errno=11)

| username: realcp1018

[TiDB Usage Environment]
Production Environment
[TiDB Version]
[Encountered Problem: Phenomenon and Impact]
When running BR v4.0.9 on the PD leader node to back up the cluster, the following error occurred:

runtime: failed to create new OS thread (have 50 already; errno=11)
runtime: may need to increase max user processes (ulimit -u)
fatal error: newosproc
A bunch of stack traces followed...

Backup command:

/home/tidb/br backup full --pd "${pd-leader}:2379" --storage "local:///br_backup/${cluster_name}/full/2023-04-23T17:57:14Z08:00" --ratelimit 120 --log-file /br_backup/${cluster_name}/full/log/2023-04-23T17:57:14Z08:00.log

errno=11 indicates a resource shortage. I checked ulimit -u, which is 4096, and the tidb user runs very few processes: mainly 3 tikv-servers, 1 pd-server, and 1 tidb-server.
However, after raising ulimit -u to 65536, the backup succeeded.
The question is: I can’t possibly have that many user processes, so why would this hit the ulimit -u limit? Goroutines shouldn’t count as user processes either.

| username: Kongdom

:thinking: Has this question been asked before?

| username: realcp1018

Sorry, it looks like this is a duplicate.
I had forgotten all about it. I seem to have muddled through it last time, so it did not leave much of an impression.
The problem is the same: the error reports only a small number of threads (50), yet changing ulimit -u does resolve it.
I did some reading: a Go program creates multiple OS threads when it starts, and on Linux these count toward ulimit -u. The RLIMIT_NPROC section of man setrlimit explains: "The maximum number of processes (or, more precisely on Linux, threads) that can be created for the real user ID of the calling process. Upon encountering this limit, fork(2) fails with the error EAGAIN." EAGAIN is errno=11.
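To check this on a given host, one option is to count threads rather than processes for the user and compare the total with the soft limit. The sketch below assumes a Linux /proc filesystem; the function names are hypothetical and are not part of BR or TiDB. Note that the "have 50 already" in the Go message appears to be the thread count of the failing process alone, while RLIMIT_NPROC is applied across all threads of the user, which would include every tikv-server, pd-server, and tidb-server thread.

```python
import os
import resource


def threads_in_status(status_text):
    """Parse /proc/<pid>/status content; return (real_uid, thread_count)."""
    uid = threads = None
    for line in status_text.splitlines():
        if line.startswith("Uid:"):
            # Fields are real, effective, saved, and filesystem UID.
            uid = int(line.split()[1])
        elif line.startswith("Threads:"):
            threads = int(line.split()[1])
    return uid, threads


def count_user_threads(uid, proc_root="/proc"):
    """Sum the thread counts of all processes whose real UID is `uid`."""
    if not os.path.isdir(proc_root):
        return 0  # assumption: no procfs means nothing can be counted
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                owner, threads = threads_in_status(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if owner == uid and threads is not None:
            total += threads
    return total


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    used = count_user_threads(os.getuid())
    print(f"threads owned by uid {os.getuid()}: {used}; soft RLIMIT_NPROC: {soft}")
```

Run as the tidb user, this prints the number that the kernel compares against ulimit -u. If it is close to 4096, the limit is plausibly being hit by threads even though only a handful of processes exist.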
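So increasing ulimit -u is effective. Setting GOMAXPROCS might also help, but it only limits the threads running Go code at the same time (threads blocked in system calls are not counted), and setting it globally would affect all Go programs, which is a larger scope. I haven’t tested it yet.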
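If the concern is scope, the limit can be raised for a single child process instead of the whole session. The following is a minimal sketch, assuming the hard limit for the user is high enough; `raised_nproc` and `run_with_nproc` are hypothetical helper names, and the target of 65536 is simply the value that worked above.

```python
import resource
import subprocess


def raised_nproc(soft, hard, target):
    """Return the (soft, hard) pair to request: soft raised toward target, capped by hard."""
    if soft == resource.RLIM_INFINITY:
        return soft, hard  # already unlimited, nothing to raise
    if hard == resource.RLIM_INFINITY:
        return max(soft, target), hard
    # An unprivileged process cannot raise its soft limit above the hard limit.
    return max(soft, min(target, hard)), hard


def run_with_nproc(cmd, target=65536):
    """Run cmd with a raised RLIMIT_NPROC soft limit that applies to the child only."""
    def set_limit():
        # Runs in the child between fork and exec, so the parent is unaffected.
        soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        resource.setrlimit(resource.RLIMIT_NPROC, raised_nproc(soft, hard, target))

    return subprocess.run(cmd, preexec_fn=set_limit)
```

A call such as `run_with_nproc(["/home/tidb/br", "backup", "full", ...])`, with the remaining arguments taken from the backup command above, would then start only the backup with the higher limit. An environment variable such as GOMAXPROCS can be confined to one process in the same way, through the `env` argument of `subprocess.run`.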
