Message-ID: <572805FD.9080202@suse.de>
Date: Tue, 3 May 2016 11:59:25 +1000
From: Aleksa Sarai <asarai@...e.de>
To: James Bottomley <James.Bottomley@...senPartnership.com>,
Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
Johannes Weiner <hannes@...xchg.org>
Cc: cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
dev@...ncontainers.org, Aleksa Sarai <cyphar@...har.com>
Subject: Re: [PATCH v2] cgroup: allow management of subtrees by new cgroup
namespaces
>> Change the mode of the cgroup directory for each cgroup association,
>> allowing the process to create subtrees and modify the limits of the
>> subtrees *without* allowing the process to modify its own limits. Due
>> to the cgroup core restrictions and unix permission model, this
>> allows for processes to create new subtrees without breaking the
>> cgroup limits for the process.
>
> Actually, that's not really what this patch does. If you unshare
> without having created any cgroups, it sets the other permission of the
> entire top level hierarchy to o+rwx:
While that is odd, it makes sense (because that's the "current cgroup"
you are in). But I agree with your point that this patch is less than ideal.
> ironically, this now makes the root group a permission denier (at least
> for my distribution), because if I were in the root group (and not
> root), the r-x on the group would rule the rwx on other ... I really
> don't think that sounds correct.
You're right, that's odd. I'm confused why your root cgroups have u-w
though.
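
For reference, the "permission denier" effect comes from the classic Unix check selecting exactly one permission class (owner, then group, then other) and never falling through to a more permissive one. Below is a minimal sketch of that rule; the helper `allowed` is hypothetical, it ignores capabilities and ACLs, and the 0757 root-owned directory is an assumed example, not output from a real system.

```python
def allowed(mode, file_uid, file_gid, uid, gids, want):
    """Model of the classic Unix permission check (no ACLs, no capabilities).

    `want` is a 3-bit mask built from 4 (r), 2 (w), 1 (x).
    Exactly one class is consulted: owner, else group, else other.
    """
    if uid == file_uid:
        bits = (mode >> 6) & 0o7   # owner bits
    elif file_gid in gids:
        bits = (mode >> 3) & 0o7   # group bits (no fall-through to other)
    else:
        bits = mode & 0o7          # other bits
    return (bits & want) == want


# Assumed example: directory owned by root:root with mode rwxr-xrwx,
# i.e. group r-x and other rwx as described in the quoted text.
MODE = 0o757
WRITE = 2

# Non-root user who is a member of the root group: group r-x applies.
in_root_group = allowed(MODE, 0, 0, uid=1000, gids={0}, want=WRITE)
# Non-root user outside the root group: other rwx applies.
outside_group = allowed(MODE, 0, 0, uid=1000, gids={1000}, want=WRITE)

print(in_root_group, outside_group)  # False True
```

So in this model a user in the owning group is denied write access that an unrelated user is granted, which is the inversion described above.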
>
> Perhaps what you should be arguing then is that the default permissions
> of the cgroup directories need to be all rwx for everyone and then your
> patch becomes unnecessary?
I don't think that would be the nicest way of dealing with this (then a
process can make very large numbers of cgroups all over the tree, which
might not cause huge issues but would still be a pain for administrators
and systemds alike).
> Alternatively, if the desire is fully to virtualize /sys/fs/cgroups,
> then I think we have to decide how that would happen. I think the
> default requirements would be that a pid namespace be established (so
> only the tasks in that pid namespace would be able to be controlled by
> the cgroup namespace). That, I think, requires that any given cgroup
> namespace "own" a pid namespace (being the one present when it was
> created) but that it only gets a new virtual set of directories owned
> by the userns owner if there's a pid namespace established for the
> cgroup and cgroup->user_ns == pid_ns->user_ns (meaning we established a
> user ns then a pid one then a cgroup one, so it's now safe to treat
> root in the user_ns as owning the virtualized cgroup directories).
I know this is probably a stupid question, but why couldn't we just
compare the user_ns with the tcred->user_ns? Or are you worried about a
process in a cgroup namespace moving processes to a subtree that isn't
in the same pid namespace (even though they're in the same user
namespace)? I don't mind implementing that this way (although we'd have
to change a bunch of the checks with pid_ns to use the
cgroup_ns->pid_ns), I'm just wondering if it's necessary.
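
To make the proposed rule concrete, here is a purely illustrative model of it. This is not kernel code: the classes `UserNS`, `PidNS`, `CgroupNS` and the function `virtual_dir_owner` are hypothetical names, and representing "no pid namespace established" as `None` is an assumption made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # eq=False: namespaces are compared by identity
class UserNS:
    owner_uid: int


@dataclass(eq=False)
class PidNS:
    user_ns: UserNS


@dataclass(eq=False)
class CgroupNS:
    user_ns: UserNS
    # The pid namespace present when the cgroup namespace was created
    # (None here stands for "no new pid namespace was established").
    pid_ns: Optional[PidNS]


def virtual_dir_owner(cg: CgroupNS) -> Optional[int]:
    """Return the uid that would own the virtualized directories, or None.

    Models the condition cgroup->user_ns == pid_ns->user_ns.
    """
    if cg.pid_ns is None:
        return None
    if cg.user_ns is not cg.pid_ns.user_ns:
        return None
    return cg.user_ns.owner_uid


init_userns = UserNS(owner_uid=0)
container_userns = UserNS(owner_uid=1000)

# userns, then pidns, then cgroupns: all share the same user namespace.
matching = CgroupNS(container_userns, PidNS(container_userns))
# The pid namespace belongs to a different user namespace.
mismatched = CgroupNS(container_userns, PidNS(init_userns))

print(virtual_dir_owner(matching), virtual_dir_owner(mismatched))  # 1000 None
```

Only the first case, where the cgroup namespace and its pid namespace share a user namespace, yields directories owned by the userns owner in this model.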
> We could do this in the same way that proc gets virtualized after
> remounting (in a new mount namespace) on fork into a pid namespace.
I actually really like this idea. I'll get to work on it.
--
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/