Message-ID: <87ee42kedj.fsf@email.froward.int.ebiederm.org>
Date: Wed, 16 Feb 2022 09:41:44 -0600
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Michal Koutný <mkoutny@...e.com>
Cc: linux-kernel@...r.kernel.org, Alexey Gladkov <legion@...nel.org>,
Kees Cook <keescook@...omium.org>,
Shuah Khan <shuah@...nel.org>,
Christian Brauner <brauner@...nel.org>,
Solar Designer <solar@...nwall.com>,
Ran Xiaokai <ran.xiaokai@....com.cn>,
containers@...ts.linux-foundation.org, stable@...r.kernel.org
Subject: Re: [PATCH 4/8] ucounts: Only except the root user in init_user_ns
from RLIMIT_NPROC
Michal Koutný <mkoutny@...e.com> writes:
> On Thu, Feb 10, 2022 at 08:13:20PM -0600, "Eric W. Biederman" <ebiederm@...ssion.com> wrote:
>> @@ -1881,7 +1881,7 @@ static int do_execveat_common(int fd, struct filename *filename,
> [...]
>> - (current_user() != INIT_USER) &&
>> + (current_ucounts() != &init_ucounts) &&
> [...]
>> @@ -2027,7 +2027,7 @@ static __latent_entropy struct task_struct *copy_process(
> [...]
>> - if (p->real_cred->user != INIT_USER &&
>> + if ((task_ucounts(p) != &init_ucounts) &&
>
> These substitutions make sense to me.
>
>> !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
>> goto bad_fork_cleanup_count;
>> }
>> diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
>> index 6b2e3ca7ee99..f0c04073403d 100644
>> --- a/kernel/user_namespace.c
>> +++ b/kernel/user_namespace.c
>> @@ -123,6 +123,8 @@ int create_user_ns(struct cred *new)
>> ns->ucount_max[i] = INT_MAX;
>> }
>> set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC));
>> + if (new->ucounts == &init_ucounts)
>> + set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_NPROC, RLIMIT_INFINITY);
>> set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_MSGQUEUE, rlimit(RLIMIT_MSGQUEUE));
>> set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_SIGPENDING, rlimit(RLIMIT_SIGPENDING));
>> set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_MEMLOCK, rlimit(RLIMIT_MEMLOCK));
>
> First, I wanted to object to this double fork_init(), but I realized
> it's relevant for a newly created user_ns.
>
> Second, I think new->ucounts would not yet be correct at this point,
> and the check should be
>
>> if (ucounts == &init_ucounts)
>
> i.e. before set_cred_ucounts() new->ucounts may not be correct.
>
> I'd also suggest a comment in create_user_ns() explaining that the
> reason is to exempt global root from RLIMIT_NPROC also indirectly via
> descendant user_nss.

Yes.

This one got culled from the next version of my patchset as it is not
conservative enough. I think it is probably the right general
direction.

On further reflection I am not convinced that it makes sense to test
user or ucounts. They are really not fields designed to support
permission checks.

I think if we want to exempt the root user's children from the root
user's rlimit, using the second set_rlimit_ucount_max is the way to go.

Someone filed a bug that strongly suggests that we want the second
set_rlimit_ucount_max:

https://bugzilla.kernel.org/show_bug.cgi?id=215596

I am still trying to understand that case.
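
To make the effect of the second set_rlimit_ucount_max concrete, here is
a minimal user-space sketch of the idea, not the kernel code. Every
name in it (model_ucounts, model_user_ns, model_create_user_ns,
MODEL_RLIM_INFINITY) is a hypothetical stand-in. It also assumes the
creator's ucounts is passed in explicitly, which sidesteps the question
above of whether new->ucounts is valid before set_cred_ucounts().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for RLIM_INFINITY in this user-space model. */
#define MODEL_RLIM_INFINITY LONG_MAX

/* Hypothetical stand-ins for struct ucounts and struct user_namespace. */
struct model_ucounts { int uid; };
struct model_user_ns { long nproc_max; };

/* Models init_ucounts: the ucounts of root in the initial user namespace. */
static struct model_ucounts model_init_ucounts = { 0 };

/*
 * Models the two consecutive assignments from the quoted hunk:
 * first inherit the creator's RLIMIT_NPROC as the namespace maximum,
 * then lift it to "infinity" when the creator is init_ucounts.
 */
static void model_create_user_ns(struct model_user_ns *ns,
                                 const struct model_ucounts *creator,
                                 long creator_rlimit_nproc)
{
    assert(ns != NULL && creator != NULL);

    ns->nproc_max = creator_rlimit_nproc;
    if (creator == &model_init_ucounts)
        ns->nproc_max = MODEL_RLIM_INFINITY;
}
```

In this model a namespace created by the global root is not capped by
root's own RLIMIT_NPROC, while a namespace created by any other user
inherits that user's limit as its maximum.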
Eric