[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87zgmi5rhm.fsf@email.froward.int.ebiederm.org>
Date: Tue, 22 Feb 2022 18:57:57 -0600
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Michal Koutný <mkoutny@...e.com>
Cc: Etienne Dechamps <etienne@...champs.fr>,
Alexey Gladkov <legion@...nel.org>,
Kees Cook <keescook@...omium.org>,
Shuah Khan <shuah@...nel.org>,
Christian Brauner <brauner@...nel.org>,
Solar Designer <solar@...nwall.com>,
Ran Xiaokai <ran.xiaokai@....com.cn>,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
Linux Containers <containers@...ts.linux-foundation.org>
Subject: Re: [RFC PATCH 0/6] RLIMIT_NPROC in ucounts fixups
Michal Koutný <mkoutny@...e.com> writes:
> On Sat, Feb 12, 2022 at 03:32:30PM +0000, Etienne Dechamps <etienne@...champs.fr> wrote:
>> I'm not sure I understand everything that's going on in this thread but it
>> does seem very relevant. You guys might want to double-check the behavior in
>> the particular scenario described there. I'm mostly sending this to make
>> sure everything is cross-linked.
>
> Thanks for the report with strace.
>
> AFAICT, it's caused by setresuid() after unshare(), i.e. all root's
> tasks are (wrongly) compared against the lowered RLIMIT_NPROC.
>
> This is tackled by my RFC patch 2/6 [1] or Eric's variant but 3/8
> (equivalent fix for this case but I haven't run that build).
>
> Michal
>
> [1] I could run your test (LimitNPROC=1 actually) against kernel with my
> patches and the service starts.
So I looked into this and our previous patchsets (but not my final one)
did resolve this.
What fixed it and what is needed to fix this is not enforcing
RLIMIT_NPROC when the user who creates the user namespace is INIT_USER.
AKA something like the patch below. It is a regression so if at all
possible it needs to be fixed, and it is certainly possible.
The patch below feels right at first glance, but I am not convinced that
testing cred->user or cred->ucounts is the proper test so I am going to
sleep on this a little bit.
I did want everyone to know I looked into this and I am going to ensure
this gets fixed.
diff --git a/kernel/fork.c b/kernel/fork.c
index 17d8a8c85e3b..532ce5cbf851 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2027,7 +2027,7 @@ static __latent_entropy struct task_struct *copy_process(
retval = -EAGAIN;
if (is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
- if (p->real_cred->user != INIT_USER &&
+ if (p->real_cred->ucounts != &init_ucounts &&
!capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
goto bad_fork_cleanup_count;
}
diff --git a/kernel/sys.c b/kernel/sys.c
index 97dc9e5d6bf9..7b5d74a7845c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -490,7 +490,7 @@ static void flag_nproc_exceeded(struct cred *new)
* failure to the execve() stage.
*/
if (is_ucounts_overlimit(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) &&
- new->user != INIT_USER)
+ new->ucounts != &init_ucounts)
current->flags |= PF_NPROC_EXCEEDED;
else
current->flags &= ~PF_NPROC_EXCEEDED;
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 6b2e3ca7ee99..925fb3579ef3 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -123,6 +123,8 @@ int create_user_ns(struct cred *new)
ns->ucount_max[i] = INT_MAX;
}
set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC));
+ if (new->ucounts == &init_ucounts)
+ set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_NPROC, RLIM_INFINITY);
set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_MSGQUEUE, rlimit(RLIMIT_MSGQUEUE));
set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_SIGPENDING, rlimit(RLIMIT_SIGPENDING));
set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_MEMLOCK, rlimit(RLIMIT_MEMLOCK));
Powered by blists - more mailing lists