linux-kernel - Re: [RFC PATCH 0/2] support cgroup pool in v1

Message-ID: <YT97PAm6kaecvXLX@slm.duckdns.org>
Date:   Mon, 13 Sep 2021 06:24:28 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Christian Brauner <christian.brauner@...ntu.com>
Cc:     "taoyi.ty" <escape@...ux.alibaba.com>,
        Greg KH <gregkh@...uxfoundation.org>, lizefan.x@...edance.com,
        hannes@...xchg.org, mcgrof@...nel.org, keescook@...omium.org,
        yzaikin@...gle.com, linux-kernel@...r.kernel.org,
        cgroups@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        shanpeic@...ux.alibaba.com
Subject: Re: [RFC PATCH 0/2] support cgroup pool in v1

Hello,

On Mon, Sep 13, 2021 at 04:20:59PM +0200, Christian Brauner wrote:
> Afaict, there is currently no way to prevent the deletion of empty
> cgroups, especially newly created ones. So for example, if I have a
> cgroup manager that prunes the cgroup tree whenever it detects empty
> cgroups, it can delete cgroups that were pre-allocated. This is
> something we have run into before.

systemd doesn't mess with cgroups behind a delegation point.

> A related problem is a crashed or killed container manager
> (segfault, sigkill, etc.). It might not have had the chance to clean up
> cgroups it allocated for the container. If the container manager is
> restarted it can't reuse the existing cgroup it found because it has no
> way of guaranteeing whether in between the time it crashed and got
> restarted another program has just created a cgroup with the same name.
> We usually solve this by just creating another cgroup with an index
> appended until we find an unallocated one, with an arbitrary cutoff
> (e.g. 1000) after which we require manual intervention by the user.
> 
> Right now iirc, one can rmdir() an empty cgroup while someone still
> holds a file descriptor open for it. This can lead to a situation where a
> cgroup got created but before moving into the cgroup (via clone3() or
> write()) someone else has deleted it. What would already be helpful is
> if one had a way to prevent the deletion of cgroups when someone still
> has an open reference to it. This would allow a pool of cgroups to be
> created that can't simply be deleted.

The above are problems common to any entity managing a cgroup hierarchy.
Beyond permission- and delegation-based access control, cgroup doesn't
have a mechanism to grant exclusive managerial operations to a specific
application. It's userspace's responsibility to coordinate these
operations, as with most other kernel interfaces.
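
For illustration only, the indexed-name fallback described in the quoted
text could be sketched in userspace roughly as follows. This is a minimal
sketch, not code from the patch set or from any existing container manager:
claim_cgroup(), its parameters, and the "-N" suffix scheme are hypothetical
names chosen for the example. It relies only on mkdir() being atomic (it
either creates the directory or fails with EEXIST), and it does nothing to
stop another process from later removing the directory.

```python
import os


def claim_cgroup(parent, name, max_tries=1000):
    """Create and return the first free directory among name, name-1, name-2, ...

    Hypothetical helper: 'parent' would be a delegated cgroup directory in
    real use, but any directory works for trying the logic out.
    """
    for i in range(max_tries):
        candidate = name if i == 0 else f"{name}-{i}"
        path = os.path.join(parent, candidate)
        try:
            # mkdir either creates the directory or raises FileExistsError,
            # so two managers racing for one name cannot both succeed.
            os.mkdir(path)
        except FileExistsError:
            continue  # name already taken; try the next index
        return path
    # Arbitrary cutoff reached; leave it to the user to clean up.
    raise RuntimeError(f"no free name for {name!r} after {max_tries} tries")
```

Calling claim_cgroup(parent, "ct") twice on the same parent would yield
"ct" and then "ct-1"; max_tries plays the role of the arbitrary cutoff
mentioned above.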

Thanks.

-- 
tejun