linux-kernel - Re: [PATCH 1/4] memcg, mm: introduce lowlimit reclaim

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140506165150.GI19914@cmpxchg.org>
Date:	Tue, 6 May 2014 12:51:50 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Greg Thelen <gthelen@...gle.com>,
	Michel Lespinasse <walken@...gle.com>,
	Tejun Heo <tj@...nel.org>, Hugh Dickins <hughd@...gle.com>,
	Roman Gushchin <klamm@...dex-team.ru>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
	Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH 1/4] memcg, mm: introduce lowlimit reclaim

On Tue, May 06, 2014 at 06:12:56PM +0200, Michal Hocko wrote:
> I am adding Rik to CC (sorry to put you in the middle of a thread -
> we have started here: https://lkml.org/lkml/2014/4/28/237). You were
> stressing out risks of using lowlimit as a hard guarantee at LSF. Could
> you repeat your concerns here as well, please?
> 
> Short summary:
> We are basically discussing how to handle lowlimit overcommit situation,
> when no group is reclaimable because it either doesn't have any pages on
> the LRU or it is bellow its lowlimit (aka guaranteed memory).
> 
> The solution proposed in this series is to fallback and reclaim
> everybody rather than OOM with a note that if somebody really needs an
> OOM then we can add a per-memcg knob which tells whether to fallback or oom.
> 
> Previously I was suggesting OOM as a default but I realized that this
> might be too risky for the default behavior although I can see some
> point in that behavior as well (it would allow to have a group which
> would never reclaim memory and rather go OOM where the memory demand can
> be handled more specifically). I do not have any call for such a hard
> guarantee requirement usecase now and it would be quite trivial to build
> it on top of the more relaxed implementation so I am more inclined to
> the fallback default now.
> 
> More comments inlined below.
> 
> On Tue 06-05-14 11:21:12, Johannes Weiner wrote:
> > On Tue, May 06, 2014 at 04:32:42PM +0200, Michal Hocko wrote:
> > > On Tue 06-05-14 09:29:32, Johannes Weiner wrote:
> > > > On Fri, May 02, 2014 at 06:00:56PM -0400, Johannes Weiner wrote:
> > > > > On Fri, May 02, 2014 at 06:49:30PM +0200, Michal Hocko wrote:
> > > > > > On Fri 02-05-14 11:58:05, Johannes Weiner wrote:
> > > > > > > This is not even guarantees anymore, but rather another reclaim
> > > > > > > prioritization scheme with best-effort semantics.  That went over
> > > > > > > horribly with soft limits, and I don't want to repeat this.
> > > > > > > 
> > > > > > > Overcommitting on guarantees makes no sense, and you even agree you
> > > > > > > are not interested in it.  We also agree that we can always add a knob
> > > > > > > later on to change semantics when an actual usecase presents itself,
> > > > > > > so why not start with the clear and simple semantics, and the simpler
> > > > > > > implementation?
> > > > > > 
> > > > > > So you are really preferring an OOM instead? That was the original
> > > > > > implementation posted at the end of last year and some people
> > > > > > had concerns about it. This is the primary reason I came up with a
> > > > > > weaker version which fallbacks rather than OOM.
> > > > > 
> > > > > I'll dig through the archives on this then, thanks.
> > > > 
> > > > The most recent discussion on this I could find was between you and
> > > > Greg, where the final outcome was (excerpt):
> > > > 
> > > > ---
> > > > 
> > > > From: Greg Thelen <gthelen@...gle.com>
> > > > To: Michal Hocko <mhocko@...e.cz>
> > > > Cc: linux-mm@...ck.org,  Johannes Weiner <hannes@...xchg.org>,  Andrew Morton <akpm@...ux-foundation.org>,  KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,  LKML <linux-kernel@...r.kernel.org>,  Ying Han <yinghan@...gle.com>,  Hugh Dickins <hughd@...gle.com>,  Michel Lespinasse <walken@...gle.com>,  KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,  Tejun Heo <tj@...nel.org>
> > > > Subject: Re: [RFC 0/4] memcg: Low-limit reclaim
> > > > References: <1386771355-21805-1-git-send-email-mhocko@...e.cz>
> > > > 	<xr93sis6obb5.fsf@...elen.mtv.corp.google.com>
> > > > 	<20140130123044.GB13509@...p22.suse.cz>
> > > > 	<xr931tzphu50.fsf@...elen.mtv.corp.google.com>
> > > > 	<20140203144341.GI2495@...p22.suse.cz>
> > > > Date: Mon, 03 Feb 2014 17:33:13 -0800
> > > > Message-ID: <xr93zjm7br1i.fsf@...elen.mtv.corp.google.com>
> > > > List-ID: <linux-mm.kvack.org>
> > > > 
> > > > On Mon, Feb 03 2014, Michal Hocko wrote:
> > > > 
> > > > > On Thu 30-01-14 16:28:27, Greg Thelen wrote:
> > > > >> But this soft_limit,priority extension can be added later.
> > > > >
> > > > > Yes, I would like to have the strong semantic first and then deal with a
> > > > > weaker form. Either by a new limit or a flag.
> > > > 
> > > > Sounds good.
> > > > 
> > > > ---
> > > > 
> > > > So I think everybody involved in the discussions so far are preferring
> > > > a hard guarantee, and then later, if needed, to either add a knob to
> > > > make it a soft guarantee or to actually implement a usable soft limit.
> > > 
> > > I am afraid the most of that discussion happened off-list :( Sadly not
> > > much of a discussion happened on the list.
> > 
> > Time to do it now, then :)
> > 
> > > Sorry I should have been specific and mention that the discussions
> > > happened at LSF and partly at the KS.
> > > 
> > > The strongest point was made by Rik when he claimed that memcg is not
> > > aware of memory zones and so one memcg with lowlimit larger than the
> > > size of a zone can eat up that zone without any way to free it.
> > 
> > But who actually cares if an individual zone can be reclaimed?
> > 
> > Userspace allocations can fall back to any other zone.  Unless there
> > are hard bindings, but hopefully nobody binds a memcg to a node that
> > is smaller than that memcg's guarantee. 
> 
> The protected group might spill over to another group and eat it when
> another group would be simply pushed out from the node it is bound to.

I don't really understand the point you're trying to make.

> > And while the pages are not
> > reclaimable, they are still movable, so the NUMA balancer is free to
> > correct any allocation mistakes later on.
> 
> Do we want to depend on NUMA balancer, though?

You're missing my point.

This is about which functionality of the system is actually impeded by
having large portions of a zone unreclaimable.  Freeing pages in a
zone is means to an end, not an end in itself.

We wouldn't depend on the NUMA balancer to "free" a zone, I'm just
saying that the NUMA balancer would be unaffected by a zone full of
unreclaimable pages, as long as they are movable.

So who exactly cares about the ability to reclaim individual zones and
how is it a new type of problem compared to existing unreclaimable but
movable memory?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/