lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ebf480701cd22da00c89c5b1b00d31be95ff8e4d.camel@redhat.com>
Date: Thu, 14 Dec 2023 10:37:01 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Maciej Żenczykowski <maze@...gle.com>, Maciej
	Żenczykowski
	 <zenczykowski@...il.com>
Cc: Linux Network Development Mailing List <netdev@...r.kernel.org>,  "David
 S . Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Flavio Crisciani <fcrisciani@...gle.com>,
 "Theodore Y. Ts'o" <tytso@...gle.com>, "Eric W. Biederman"
 <ebiederm@...ssion.com>, Dmitry Torokhov <dmitry.torokhov@...il.com>
Subject: Re: [PATCH] net: sysctl: fix edge case wrt. sysctl write access

On Sun, 2023-12-10 at 03:10 -0800, Maciej Żenczykowski wrote:
> The clear intent of net_ctl_permissions() is that having CAP_NET_ADMIN
> grants write access to networking sysctls.
> 
> However, it turns out there is an edge case where this is insufficient:
> inode_permission() has an additional check on HAS_UNMAPPED_ID(inode)
> which can return -EACCES and thus block *all* write access.
> 
> Note: AFAICT this check is wrt. the uid/gid mapping that was
> active at the time the filesystem (ie. proc) was mounted.
> 
> In order for this check to not fail, we need net_ctl_set_ownership()
> to set valid uid/gid.  It is not immediately clear what value
> to use, nor what values are guaranteed to work.
> It does make sense that /proc/sys/net appear to be owned by root
> from within the netns owning userns.  As such we only modify
> what happens if the code fails to map uid/gid 0.
> Currently the code just fails to do anything, which in practice
> results in using the zeroes of freshly allocated memory,
> and we thus end up with global root.
> With this change we instead use the uid/gid of the owning userns.
> While it is probably (?) theoretically possible for this to *also*
> be unmapped from the /proc filesystem's point of view, this seems
> much less likely to happen in practice.
> 
> The old code is observed to fail in a relatively complex setup,
> within a global root created user namespace with selectively
> mapped uid/gids (not including global root) and /proc mounted
> afterwards (so this /proc mount does not have global root mapped).
> Within this user namespace another non privileged task creates
> a new user namespace, maps it's own uid/gid (but not uid/gid 0),
> and then creates a network namespace.  It cannot write to networking
> sysctls even though it does have CAP_NET_ADMIN.

I'm wondering if this specific scenario should be considered a setup 
issue, and should be solved with a different configuration? I would
love to hear others opinions!

> This is because net_ctl_set_ownership fails to map uid/gid 0
> (because uid/gid 0 are *not* mapped in the owning 2nd level user_ns),
> and falls back to global root.
> But global root is not mapped in the 1st level user_ns,
> which was inherited by the /proc mount, and thus fails...
> 
> Note: the uid/gid of networking sysctls is of purely superficial
> importance, outside of this UNMAPPED check, it does not actually
> affect access, and only affects display.
> 
> Access is always based on whether you are *global* root uid
> (or have CAP_NET_ADMIN over the netns) for user write access bits
> (or are in *global* root gid for group write access bits).
> 
> Cc: Flavio Crisciani <fcrisciani@...gle.com>
> Cc: "Theodore Y. Ts'o" <tytso@...gle.com>
> Cc: "Eric W. Biederman" <ebiederm@...ssion.com>
> Cc: Dmitry Torokhov <dmitry.torokhov@...il.com>
> Fixes: e79c6a4fc923 ("net: make net namespace sysctls belong to container's owner")
> Signed-off-by: Maciej Żenczykowski <maze@...gle.com>
> ---
>  net/sysctl_net.c | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/net/sysctl_net.c b/net/sysctl_net.c
> index 051ed5f6fc93..ded399f380d9 100644
> --- a/net/sysctl_net.c
> +++ b/net/sysctl_net.c
> @@ -58,16 +58,11 @@ static void net_ctl_set_ownership(struct ctl_table_header *head,
>  				  kuid_t *uid, kgid_t *gid)
>  {
>  	struct net *net = container_of(head->set, struct net, sysctls);
> -	kuid_t ns_root_uid;
> -	kgid_t ns_root_gid;
> +	kuid_t ns_root_uid = make_kuid(net->user_ns, 0);
> +	kgid_t ns_root_gid = make_kgid(net->user_ns, 0);
>  
> -	ns_root_uid = make_kuid(net->user_ns, 0);

As a fix I would prefer you would keep it minimal. e.g. just replace
the if with the ternary operator or just add an 'else' branch.

Cheers,

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ