Message-ID: <20170726083017.3yzeucmi7lcj46qd@esperanza>
Date:   Wed, 26 Jul 2017 11:30:17 +0300
From:   Vladimir Davydov <vdavydov.dev@...il.com>
To:     Roman Gushchin <guro@...com>
Cc:     linux-mm@...ck.org, Tejun Heo <tj@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>, kernel-team@...com,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm, memcg: reset low limit during memcg offlining

On Tue, Jul 25, 2017 at 01:31:13PM +0100, Roman Gushchin wrote:
> On Tue, Jul 25, 2017 at 03:05:37PM +0300, Vladimir Davydov wrote:
> > On Tue, Jul 25, 2017 at 12:40:47PM +0100, Roman Gushchin wrote:
> > > A removed memory cgroup with a defined low limit and some belonging
> > > pagecache has very low chances to be freed.
> > > 
> > > If a cgroup has been removed, there is likely no memory pressure inside
> > > the cgroup, and the pagecache is protected from the external pressure
> > > by the defined low limit. The cgroup will be freed only after
> > > > the reclaim of all belonging pages. And that will not happen as long as
> > > > there is any other reclaimable memory in the system. That means
> > > > there is a good chance that cold pagecache will reside
> > > in the memory for an undefined amount of time, wasting
> > > system resources.
> > > 
> > > Fix this issue by zeroing memcg->low during memcg offlining.
> > > 
> > > Signed-off-by: Roman Gushchin <guro@...com>
> > > Cc: Tejun Heo <tj@...nel.org>
> > > Cc: Johannes Weiner <hannes@...xchg.org>
> > > Cc: Michal Hocko <mhocko@...nel.org>
> > > Cc: Vladimir Davydov <vdavydov.dev@...il.com>
> > > Cc: kernel-team@...com
> > > Cc: cgroups@...r.kernel.org
> > > Cc: linux-mm@...ck.org
> > > Cc: linux-kernel@...r.kernel.org
> > > ---
> > >  mm/memcontrol.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index aed11b2d0251..2aa204b8f9fd 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -4300,6 +4300,8 @@ static void mem_cgroup_css_offline(struct cgroup_subsys_state *css)
> > >  	}
> > >  	spin_unlock(&memcg->event_list_lock);
> > >  
> > > +	memcg->low = 0;
> > > +
> > >  	memcg_offline_kmem(memcg);
> > >  	wb_memcg_offline(memcg);
> > >  
> > 
> > We already have that - see mem_cgroup_css_reset().
> 
> Hm, I see...
> 
> But are you sure that calling mem_cgroup_css_reset() from the offlining path
> is always a good idea?
> 
> As I understand, css_reset() callback is intended to _completely_ disable all
> limits, as if there were no cgroup at all.

But that's exactly what cgroup offline is: deletion of a cgroup as if it
never existed. The fact that we leave the zombie dangling until all
pages charged to the cgroup are gone is an implementation detail. IIRC
we would "reparent" those charges and delete the mem_cgroup right away
if it were not inherently racy.

> And its main purpose is to be called
> when controllers are detached from the hierarchy.
> 
> Offlining is different: some limits make perfect sense after offlining
> (e.g. we want to limit the writeback speed), and others might be tweaked
> (e.g. we can set soft limit to prioritize reclaiming of abandoned cgroups).

The user can't tweak limits of an offline cgroup, because the cgroup
directory no longer exists. So IMHO resetting all limits is reasonable.
If you want to keep the cgroup limits effective, you shouldn't have
deleted it in the first place, I suppose.
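
To make the effect concrete, here is a toy userspace model, not kernel
code. struct toy_memcg and the toy_*() helpers are hypothetical names
made up for illustration, and the protection check is only a rough
approximation of what the low limit does: a group whose usage is below
its low limit is skipped by reclaim, and clearing the limit at offline
makes the leftover pagecache reclaimable again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a memory cgroup; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	unsigned long usage;	/* pages currently charged to the group */
	unsigned long low;	/* low limit in pages, 0 means no protection */
	bool online;		/* false once the group has been removed */
};

/* Simplified check: usage below the low limit is shielded from reclaim. */
static bool toy_is_protected(const struct toy_memcg *cg)
{
	return cg->usage < cg->low;
}

/*
 * External reclaim: try to free up to nr pages from the group.
 * Returns the number of pages freed; a protected group yields nothing.
 */
static unsigned long toy_reclaim(struct toy_memcg *cg, unsigned long nr)
{
	unsigned long freed;

	if (toy_is_protected(cg))
		return 0;

	freed = nr < cg->usage ? nr : cg->usage;
	cg->usage -= freed;
	return freed;
}

/* Offlining in this model: mark the group dead and drop its protection. */
static void toy_offline(struct toy_memcg *cg)
{
	assert(cg->online);
	cg->online = false;
	cg->low = 0;
}
```

In this model a group with usage 100 and low 200 gives up no pages to
toy_reclaim() while it still has its limit. After toy_offline() the same
call frees pages as for any unprotected group, so the zombie can
eventually be torn down.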

You might also want to check out this:

  http://www.spinics.net/lists/linux-mm/msg102995.html

> 
> So, I'd prefer to move this code to the offlining callback,
> and not to call css_reset.
> 
> But, anyway, thanks for pointing at the mem_cgroup_css_reset().
