Message-ID: <Ynv7+VG+T2y9rpdk@carbon>
Date: Wed, 11 May 2022 11:10:01 -0700
From: Roman Gushchin <roman.gushchin@...ux.dev>
To: Michal Koutný <mkoutny@...e.com>
Cc: Vasily Averin <vvs@...nvz.org>, Vlastimil Babka <vbabka@...e.cz>,
Shakeel Butt <shakeelb@...gle.com>, kernel@...nvz.org,
Florian Westphal <fw@...len.de>, linux-kernel@...r.kernel.org,
Michal Hocko <mhocko@...e.com>, cgroups@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Tejun Heo <tj@...nel.org>
Subject: Re: kernfs memcg accounting

On Wed, May 11, 2022 at 06:34:39PM +0200, Michal Koutny wrote:
> On Tue, May 10, 2022 at 08:06:24PM -0700, Roman Gushchin <roman.gushchin@...ux.dev> wrote:
> > My primary goal was to apply the memory pressure on memory cgroups with a lot
> > of (dying) children cgroups. On a multi-cpu machine a memory cgroup structure
> > is way larger than a page, so a cgroup which looks small can be really large
> > if we calculate the amount of memory taken by all children memcg internals.
> >
> > Applying this pressure to another cgroup (e.g. the one which contains systemd)
> > doesn't help to reclaim any pages which are pinning the dying cgroups.
>
> Just a note -- this is another use case: cgroups created from within the
> subtree (e.g. a container). I agree that the cgroup-manager/systemd case is
> also valid (as dying memcgs may accumulate after a restart).
>
> memcgs are special because their retained state has a memory footprint.
>
> > For other controllers (maybe blkcg aside, idk) it shouldn't matter, because
> > there is no such problem there.
> >
> > For consistency reasons I'd suggest to charge all *large* allocations
> > (e.g. percpu) to the parent cgroup. Small allocations can be ignored.
>
> Strictly speaking, this would mean that any controller would have an
> implicit dependency on the memory controller (as the io controller
> has).
> In the extreme case even controller-less hierarchy would have such a
> requirement (for precise kernfs_node accounting).
> Such a dependency is not enforceable on v1 (with various topologies of
> different hierarchies).
>
> Although I initially favored consistency with the memory controller too,
> I think it's simpler to charge to the creator's memcg to achieve
> consistency across v1 and v2 :-)
Ok, v1/v2 consistency is a valid point.

As I said, I'm fine with both options. It shouldn't matter that much
for anything except the memory controller: cgroup internal objects are not
that large and the total memory footprint is usually small unless we have
a lot of (dying) sub-cgroups. In my experience no other controllers
should be affected (blkcg was affected due to a cgwb reference, but should
be fine now), so it's not an issue at all.
Thanks!
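
To make the point about (dying) sub-cgroups concrete, here is a rough
sketch of how one might estimate the memory pinned by dying descendants
of a cgroup. It assumes cgroup v2 and reads nr_dying_descendants from
cgroup.stat. The helper names and the per-memcg size are hypothetical:
the real size depends on the kernel version and the number of CPUs, so
the caller has to supply it.

```python
from pathlib import Path


def parse_cgroup_stat(text):
    """Parse the 'key value' lines of a cgroup v2 cgroup.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a simple 'name number' pair.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def estimate_dying_footprint(stats, per_memcg_bytes):
    """Estimate the bytes held by dying descendants.

    per_memcg_bytes is an assumption supplied by the caller; it is not
    a value reported by the kernel.
    """
    return stats.get("nr_dying_descendants", 0) * per_memcg_bytes


def read_cgroup_stat(cgroup_dir="/sys/fs/cgroup"):
    """Read and parse cgroup.stat from a cgroup v2 directory."""
    return parse_cgroup_stat((Path(cgroup_dir) / "cgroup.stat").read_text())
```

For example, estimate_dying_footprint(read_cgroup_stat(), 256 * 1024)
gives an upper-level guess for the root cgroup under the assumption
that each dying memcg holds about 256 KiB. It is only an
order-of-magnitude estimate, since it ignores the pages that keep
those cgroups pinned in the first place.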