[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <17f6eb65-e6c0-47be-b88a-e33819bbd66f@huawei.com>
Date: Tue, 13 Aug 2024 11:48:44 +0800
From: chenridong <chenridong@...wei.com>
To: Michal Koutný <mkoutny@...e.com>
CC: <tj@...nel.org>, <lizefan.x@...edance.com>, <hannes@...xchg.org>,
<longman@...hat.com>, <cgroups@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH -next 2/2] cgroup: Disallow delegatee to write all
interfaces outsize of cgroup ns
On 2024/8/13 0:57, Michal Koutný wrote:
> Hello.
>
> On Mon, Aug 12, 2024 at 07:37:46AM GMT, Chen Ridong <chenridong@...wei.com> wrote:
>> cd /sys/fs/cgroup
>> echo '+pids' > cgroup.subtree_control
>> mkdir dlgt_grp_ns
>> echo '+pids' > dlgt_grp_ns/cgroup.subtree_control
>> mkdir dlgt_grp_ns/dlgt_grp_ns1
>> echo $$ > dlgt_grp_ns/dlgt_grp_ns1/cgroup.procs
>> echo 200 > dlgt_grp_ns/dlgt_grp_ns1/pids.max
>> unshare -Cm /bin/bash
>> echo max > /dlgt_grp_ns1/pids.max // Permission denied
>> echo -pids > dlgt_grp_ns/cgroup.subtree_control // pids was unlimited now
>
> You could also have increased the ancestral limit (if there was any)
> echo max > dlgt_grp_ns/pids.max // similarly allowed
>
> If you're a root (or otherwise have sufficient permissions) and you can
> _see_ an ancestral cgroup, you can write to its attributes according to
> permissions. Thus the delegation works via cgroup ns (in)visibility but
> cgroup ns root is visible on both sides of the boundary hence the extra
> check.
>
> I can imagine that a container runtime process could enter the target
> cgroup ns while keeping visibility to the original cgroup ns and do some
> tuning on it before it drops any pointers to the original cgroup ns and
> exec's delegatee's workload. (But it's only my imagination to illustrate
> that this may be a breaking change.)
>
Yes, this is a change. However, the process within the container can
turn on subsystems , they can also turn off subsystems , which may not
only affect the host machine but may also affect other containers. I
believe this poses a danger.
> OTOH, I can see why this would be consistent with the migration rules
> that exist between sides of cgroup ns, so this could work if it was
> hidden behind (another) mount option like 'nsdelegate2' :-p
>
Hello, Michal, I am sorry I can not get your point.
Did you mean this issue should be fixed with another mount option
'nsdelegate2'? Or is the current patch acceptable?
>
>> @@ -4134,8 +4134,10 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
>> * cgroup.procs, cgroup.threads and cgroup.subtree_control.
>> */
>> if ((cgrp->root->flags & CGRP_ROOT_NS_DELEGATE) &&
>> - !(cft->flags & CFTYPE_NS_DELEGATABLE) &&
>> - ctx->ns != &init_cgroup_ns && ctx->ns->root_cset->dfl_cgrp == cgrp)
>> + ctx->ns != &init_cgroup_ns &&
>> + (!cgroup_is_descendant(cgrp, ctx->ns->root_cset->dfl_cgrp) ||
>> + (!(cft->flags & CFTYPE_NS_DELEGATABLE) &&
>> + ctx->ns->root_cset->dfl_cgrp == cgrp)))
>> return -EPERM;
>
> Could you please also update the comment above, to describe the boundary
> vs subtree delegation?
> Thanks,
> Michal
Sure.
Thanks
Ridong
Powered by blists - more mailing lists