lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ex5gnhcoobbw74se4uchhqj2lsrcjx5bsh6m5lva2xmujv7uae@34vwukkwhkbc>
Date: Mon, 12 Aug 2024 18:57:06 +0200
From: Michal Koutný <mkoutny@...e.com>
To: Chen Ridong <chenridong@...wei.com>
Cc: tj@...nel.org, lizefan.x@...edance.com, hannes@...xchg.org, 
	longman@...hat.com, cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH -next 2/2] cgroup: Disallow delegatee to write all
 interfaces outsize of cgroup ns

Hello.

On Mon, Aug 12, 2024 at 07:37:46AM GMT, Chen Ridong <chenridong@...wei.com> wrote:
> cd /sys/fs/cgroup
> echo '+pids' > cgroup.subtree_control
> mkdir dlgt_grp_ns
> echo '+pids' > dlgt_grp_ns/cgroup.subtree_control
> mkdir dlgt_grp_ns/dlgt_grp_ns1
> echo $$ > dlgt_grp_ns/dlgt_grp_ns1/cgroup.procs
> echo 200 > dlgt_grp_ns/dlgt_grp_ns1/pids.max
> unshare -Cm /bin/bash
> echo max > /dlgt_grp_ns1/pids.max // Permission denied
> echo -pids > dlgt_grp_ns/cgroup.subtree_control // pids was unlimited now

You could also have increased the ancestral limit (if there was any)
echo max > dlgt_grp_ns/pids.max // similarly allowed

If you're a root (or otherwise have sufficient permissions) and you can
_see_ an ancestral cgroup, you can write to its attributes according to
permissions. Thus the delegation works via cgroup ns (in)visibility but
cgroup ns root is visible on both sides of the boundary hence the extra
check.

I can imagine that a container runtime process could enter the target
cgroup ns while keeping visibility to the original cgroup ns and do some
tuning on it before it drops any pointers to the original cgroup ns and
exec's delegatee's workload. (But it's only my imagination to illustrate
that this may be a breaking change.)

OTOH, I can see why this would be consistent with the migration rules
that exist between sides of cgroup ns, so this could work if it was
hidden behind (another) mount option like 'nsdelegate2' :-p


> @@ -4134,8 +4134,10 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
>  	 * cgroup.procs, cgroup.threads and cgroup.subtree_control.
>  	 */
>  	if ((cgrp->root->flags & CGRP_ROOT_NS_DELEGATE) &&
> -	    !(cft->flags & CFTYPE_NS_DELEGATABLE) &&
> -	    ctx->ns != &init_cgroup_ns && ctx->ns->root_cset->dfl_cgrp == cgrp)
> +		ctx->ns != &init_cgroup_ns &&
> +		(!cgroup_is_descendant(cgrp, ctx->ns->root_cset->dfl_cgrp) ||
> +			(!(cft->flags & CFTYPE_NS_DELEGATABLE) &&
> +			ctx->ns->root_cset->dfl_cgrp == cgrp)))
>  		return -EPERM;

Could you please also update the comment above, to describe the boundary
vs subtree delegation?

Thanks,
Michal

Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ