[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ebf480701cd22da00c89c5b1b00d31be95ff8e4d.camel@redhat.com>
Date: Thu, 14 Dec 2023 10:37:01 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Maciej Żenczykowski <maze@...gle.com>, Maciej
Żenczykowski
<zenczykowski@...il.com>
Cc: Linux Network Development Mailing List <netdev@...r.kernel.org>, "David
S . Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Flavio Crisciani <fcrisciani@...gle.com>,
"Theodore Y. Ts'o" <tytso@...gle.com>, "Eric W. Biederman"
<ebiederm@...ssion.com>, Dmitry Torokhov <dmitry.torokhov@...il.com>
Subject: Re: [PATCH] net: sysctl: fix edge case wrt. sysctl write access
On Sun, 2023-12-10 at 03:10 -0800, Maciej Żenczykowski wrote:
> The clear intent of net_ctl_permissions() is that having CAP_NET_ADMIN
> grants write access to networking sysctls.
>
> However, it turns out there is an edge case where this is insufficient:
> inode_permission() has an additional check on HAS_UNMAPPED_ID(inode)
> which can return -EACCES and thus block *all* write access.
>
> Note: AFAICT this check is wrt. the uid/gid mapping that was
> active at the time the filesystem (ie. proc) was mounted.
>
> In order for this check to not fail, we need net_ctl_set_ownership()
> to set valid uid/gid. It is not immediately clear what value
> to use, nor what values are guaranteed to work.
> It does make sense that /proc/sys/net appear to be owned by root
> from within the netns owning userns. As such we only modify
> what happens if the code fails to map uid/gid 0.
> Currently the code just fails to do anything, which in practice
> results in using the zeroes of freshly allocated memory,
> and we thus end up with global root.
> With this change we instead use the uid/gid of the owning userns.
> While it is probably (?) theoretically possible for this to *also*
> be unmapped from the /proc filesystem's point of view, this seems
> much less likely to happen in practice.
>
> The old code is observed to fail in a relatively complex setup,
> within a global root created user namespace with selectively
> mapped uid/gids (not including global root) and /proc mounted
> afterwards (so this /proc mount does not have global root mapped).
> Within this user namespace another non privileged task creates
> a new user namespace, maps it's own uid/gid (but not uid/gid 0),
> and then creates a network namespace. It cannot write to networking
> sysctls even though it does have CAP_NET_ADMIN.
I'm wondering if this specific scenario should be considered a setup
issue, and should be solved with a different configuration? I would
love to hear others opinions!
> This is because net_ctl_set_ownership fails to map uid/gid 0
> (because uid/gid 0 are *not* mapped in the owning 2nd level user_ns),
> and falls back to global root.
> But global root is not mapped in the 1st level user_ns,
> which was inherited by the /proc mount, and thus fails...
>
> Note: the uid/gid of networking sysctls is of purely superficial
> importance, outside of this UNMAPPED check, it does not actually
> affect access, and only affects display.
>
> Access is always based on whether you are *global* root uid
> (or have CAP_NET_ADMIN over the netns) for user write access bits
> (or are in *global* root gid for group write access bits).
>
> Cc: Flavio Crisciani <fcrisciani@...gle.com>
> Cc: "Theodore Y. Ts'o" <tytso@...gle.com>
> Cc: "Eric W. Biederman" <ebiederm@...ssion.com>
> Cc: Dmitry Torokhov <dmitry.torokhov@...il.com>
> Fixes: e79c6a4fc923 ("net: make net namespace sysctls belong to container's owner")
> Signed-off-by: Maciej Żenczykowski <maze@...gle.com>
> ---
> net/sysctl_net.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/net/sysctl_net.c b/net/sysctl_net.c
> index 051ed5f6fc93..ded399f380d9 100644
> --- a/net/sysctl_net.c
> +++ b/net/sysctl_net.c
> @@ -58,16 +58,11 @@ static void net_ctl_set_ownership(struct ctl_table_header *head,
> kuid_t *uid, kgid_t *gid)
> {
> struct net *net = container_of(head->set, struct net, sysctls);
> - kuid_t ns_root_uid;
> - kgid_t ns_root_gid;
> + kuid_t ns_root_uid = make_kuid(net->user_ns, 0);
> + kgid_t ns_root_gid = make_kgid(net->user_ns, 0);
>
> - ns_root_uid = make_kuid(net->user_ns, 0);
As a fix I would prefer you would keep it minimal. e.g. just replace
the if with the ternary operator or just add an 'else' branch.
Cheers,
Paolo
Powered by blists - more mailing lists