Message-ID: <YrpVP6rpYGFsl3jj@castle>
Date: Mon, 27 Jun 2022 18:11:27 -0700
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: Vasily Averin <vvs@...nvz.org>
Cc: Muchun Song <songmuchun@...edance.com>,
Shakeel Butt <shakeelb@...gle.com>,
Michal Koutný <mkoutny@...e.com>,
Michal Hocko <mhocko@...e.com>, kernel@...nvz.org,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Memory Management List <linux-mm@...ck.org>,
Vlastimil Babka <vbabka@...e.cz>,
Cgroups <cgroups@...r.kernel.org>
Subject: Re: [PATCH mm v2] memcg: notify about global mem_cgroup_id space depletion

On Mon, Jun 27, 2022 at 09:49:18AM +0300, Vasily Averin wrote:
> On 6/27/22 06:23, Muchun Song wrote:
> > If the caller can know that -ENOSPC is returned by mkdir(), then I
> > think userspace (perhaps systemd) is the best place to emit the
> > error message, instead of the kernel log. Right?
>
> Such an incident may occur inside a container.
> OpenVZ nodes can host 300-400 containers, and the host admin cannot
> monitor guest logs. The dmesg message is necessary to inform the host
> owner that the global limit has been reached; otherwise he may
> continue to believe that there are no problems on the node.
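
(For context, the userspace-side check Muchun describes might look
roughly like the sketch below -- untested, and the cgroup path in the
usage comment is made up for illustration.)

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Create a memory cgroup and report ID-space exhaustion to the caller. */
static int create_memcg(const char *path)
{
	if (mkdir(path, 0755) == 0)
		return 0;

	if (errno == ENOSPC)
		/* e.g. global mem_cgroup_id space depleted */
		fprintf(stderr, "%s: cgroup id space depleted: %s\n",
			path, strerror(errno));
	else
		fprintf(stderr, "%s: mkdir failed: %s\n",
			path, strerror(errno));
	return -1;
}

/* usage (hypothetical path): create_memcg("/sys/fs/cgroup/ct1042"); */
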
Why is this happening? It's hard to believe someone really needs that
many cgroups. Does this happen when somebody fails to delete old cgroups?
I was going to suggest introducing a memcg event, but then I realized
it's probably not worth the wasted space. Is this a common scenario?
I think a better approach would be to add a cgroup event (displayed via
cgroup.events) about reaching the maximum limit of cgroups, e.g.
cgroup.events::max_nr_reached. Then you can set cgroup.max.descendants
to some value below the memcg_id space size. It's more work, but IMO it's
a better way to communicate this event. As a bonus, you can easily
get an idea of which cgroup is depleting the limit.
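
Something along these lines (a rough, untested sketch; the mount point,
the 60000 value, and the max_nr_reached key are made up -- the key does
not exist today, it is what would have to be added):

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Cap descendants below the mem_cgroup_id space (USHRT_MAX). */
	FILE *f = fopen("/sys/fs/cgroup/mycontainers/cgroup.max.descendants", "w");
	if (f) {
		fprintf(f, "60000\n");
		fclose(f);
	}

	int fd = open("/sys/fs/cgroup/mycontainers/cgroup.events", O_RDONLY);
	if (fd < 0)
		return 1;

	/* cgroup v2 event files can be watched with poll(). */
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	while (poll(&pfd, 1, -1) > 0) {
		char buf[256];
		ssize_t n;

		/* Re-read the file from the start after each notification. */
		lseek(fd, 0, SEEK_SET);
		n = read(fd, buf, sizeof(buf) - 1);
		if (n <= 0)
			break;
		buf[n] = '\0';
		/* Look for the hypothetical "max_nr_reached 1" line. */
		printf("cgroup.events changed:\n%s", buf);
	}
	close(fd);
	return 0;
}
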
Thanks!