Message-Id: <cover.1660664258.git.legion@kernel.org>
Date: Tue, 16 Aug 2022 17:42:42 +0200
From: Alexey Gladkov <legion@...nel.org>
To: LKML <linux-kernel@...r.kernel.org>,
Linux Containers <containers@...ts.linux.dev>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Christian Brauner <brauner@...nel.org>,
"Eric W . Biederman" <ebiederm@...ssion.com>,
Kees Cook <keescook@...omium.org>,
Manfred Spraul <manfred@...orfullife.com>
Subject: Re: [PATCH v1] sysctl: Allow change system v ipc sysctls inside ipc namespace
On Mon, Jul 25, 2022 at 11:16:07AM -0500, Eric W. Biederman wrote:
> Alexey Gladkov <legion@...nel.org> writes:
>
> > Rootless containers are not allowed to modify kernel IPC parameters such
> > as kernel.msgmnb.
> >
> > It seems to me that we can allow customization of these parameters if
> > the user has CAP_SYS_RESOURCE in that ipc namespace.
> >
> > CAP_SYS_RESOURCE is already needed in order to overcome mqueue limits
> > (msg_max and msgsize_max).
>
>
> For changing the permissions on who can modify the SysV limits, I don't
> think this change is safe. I don't see anything that will prevent abuse
> if anyone can modify these limits. Replacing the ordinary unix DAC
> permission check with ns_capable will allow anyone to modify the limits.

By default, all of these limits are set to nearly the maximum value
(close to ULONG_MAX). Limit values are not inherited, and usage is
counted separately in each ipc namespace (shm_tot is not global; it
lives in ipc_ns). In effect, the limits are disabled by default, so
they can only be reduced.

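To illustrate the point about per-namespace accounting, here is a minimal
user-space sketch. It is not kernel code: struct ipc_ns_model and both
functions are hypothetical names, and the default value is what I believe
SHMALL is defined as in include/uapi/linux/shm.h. The check is modelled on
the one in newseg().

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical model of the shm fields kept in each ipc namespace. */
struct ipc_ns_model {
	unsigned long shm_ctlall; /* limit on total shm pages (kernel.shmall) */
	unsigned long shm_tot;    /* pages currently in use in this namespace */
};

/* A new namespace starts from the defaults, not from its parent's values. */
void ipc_ns_model_init(struct ipc_ns_model *ns)
{
	ns->shm_ctlall = ULONG_MAX - (1UL << 24); /* assumed SHMALL default */
	ns->shm_tot = 0;
}

/* Charge numpages to one namespace. Returns 0 on success or -ENOSPC. */
int shm_charge(struct ipc_ns_model *ns, unsigned long numpages)
{
	if (ns->shm_tot + numpages < ns->shm_tot ||   /* counter would wrap */
	    ns->shm_tot + numpages > ns->shm_ctlall)  /* over this ns's limit */
		return -ENOSPC;

	ns->shm_tot += numpages;
	return 0;
}
```

Lowering shm_ctlall in one namespace only affects allocations charged to
that namespace; a sibling namespace keeps its own counter and its own
default limit.
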
> That said there is RLIMIT_MSGQUEUE that limits the posix messages queues
> so those should be safe to allow anyone to modify their limits.
>
> The code in mqueue_get_inode is where that limiting happens.
>
> For the posix message queues all that should be needed is to change the
> owner of the sysctl files from the global root to the user namespace
> root. There are also two capable calls in ipc/mqueue.c that can
> probably be changed to ns_capable calls.
>
>
> The only posix message queue limit that I don't immediately see
> something that will prevent abuse of is /proc/sys/fs/mqueue/queues_max.
> That probably still runs into RLIMIT_MSGQUEUE somewhere but it was
> not immediately obvious at first glance.

Every creation path ends up in mqueue_get_inode. mqueue_create_attr
checks mq_queues_max and then calls mqueue_get_inode almost immediately,
so the RLIMIT_MSGQUEUE accounting done there still applies.
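
The ordering can be sketched in user space as follows. This is a
simplified model, not the code in ipc/mqueue.c: all struct and function
names are hypothetical, the per-message overhead the kernel adds to the
byte count is left out, and the CAP_SYS_RESOURCE bypass of queues_max is
omitted.

```c
#include <errno.h>

/* Hypothetical per-namespace state: number of queues and its sysctl limit. */
struct mq_ns_model {
	unsigned int queues_count;
	unsigned int queues_max;
};

/* Hypothetical per-user state: bytes charged and RLIMIT_MSGQUEUE. */
struct mq_user_model {
	unsigned long mq_bytes;
	unsigned long rlimit_msgqueue;
};

/* Models the byte accounting step; inputs are assumed small enough
 * that maxmsg * msgsize does not overflow. */
int mq_get_inode_model(struct mq_user_model *u,
		       unsigned long maxmsg, unsigned long msgsize)
{
	unsigned long bytes = maxmsg * msgsize;

	if (u->mq_bytes + bytes < u->mq_bytes ||
	    u->mq_bytes + bytes > u->rlimit_msgqueue)
		return -EMFILE;

	u->mq_bytes += bytes;
	return 0;
}

/* Models the create path: namespace queue count first, then the rlimit. */
int mq_create_model(struct mq_ns_model *ns, struct mq_user_model *u,
		    unsigned long maxmsg, unsigned long msgsize)
{
	int err;

	if (ns->queues_count >= ns->queues_max)
		return -ENOSPC;

	ns->queues_count++;
	err = mq_get_inode_model(u, maxmsg, msgsize);
	if (err)
		ns->queues_count--; /* undo the count if the charge fails */
	return err;
}
```

In this model, raising queues_max inside the namespace does not let a
user exceed rlimit_msgqueue: creation starts failing with -EMFILE once
the charged bytes reach the per-user limit.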

I suggest allowing root in the user namespace to change the limits of
its ipc namespace.

--
Alexey Gladkov (3):
  sysctl: Allow change system v ipc sysctls inside ipc namespace
  sysctl: Allow to change limits for posix messages queues
  docs: Add information about ipc sysctls limitations

 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++--
 ipc/ipc_sysctl.c                            | 34 ++++++++++++++++---
 ipc/mq_sysctl.c                             | 36 +++++++++++++++++++++
 3 files changed, 76 insertions(+), 8 deletions(-)
--
2.33.4