linux-kernel - Re: [RFC 13/13] mm, compaction: fix and improve watermark handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160516123053.GI23146@dhcp22.suse.cz>
Date:	Mon, 16 May 2016 14:30:53 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Vlastimil Babka <vbabka@...e.cz>
Cc:	linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Rik van Riel <riel@...hat.com>,
	David Rientjes <rientjes@...gle.com>,
	Mel Gorman <mgorman@...hsingularity.net>,
	Johannes Weiner <hannes@...xchg.org>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC 13/13] mm, compaction: fix and improve watermark handling

On Mon 16-05-16 11:50:22, Vlastimil Babka wrote:
> On 05/16/2016 11:25 AM, Michal Hocko wrote:
> > On Tue 10-05-16 09:36:03, Vlastimil Babka wrote:
> > > Compaction has been using watermark checks when deciding whether it was
> > > successful, and whether compaction is at all suitable. There are few problems
> > > with these checks.
> > > 
> > > - __compact_finished() uses low watermark in a check that has to pass if
> > >    the direct compaction is to finish and allocation should succeed. This is
> > >    too pessimistic, as the allocation will typically use min watermark. It
> > >    may happen that during compaction, we drop below the low watermark (due to
> > >    parallel activity), but still form the target high-order page. By checking
> > >    against low watermark, we might needlessly continue compaction. After this
> > >    patch, the check uses direct compactor's alloc_flags to determine the
> > >    watermark, which is effectively the min watermark.
> > 
> > OK, this makes some sense. It would be great if we could have at least
> > some clarification why the low wmark has been used previously. Probably
> > Mel can remember?
> > 
> > > - __compaction_suitable has the same issue in the check whether the allocation
> > >    is already supposed to succeed and we don't need to compact. Fix it the same
> > >    way.
> > > 
> > > - __compaction_suitable() then checks the low watermark plus a (2 << order) gap
> > >    to decide if there's enough free memory to perform compaction. This check
> > 
> > And this was a real head scratcher when I started looking into the
> > compaction recently. Why do we need to be above low watermark to even
> > start compaction.
> 
> Hmm, above you said you're fine with low wmark (maybe after clarification).
> I don't know why it was used, can only guess.

Yes I can imagine this would be a good backoff for costly orders without
__GFP_REPEAT.

> > Compaction uses additional memory only for a short
> > period of time and then releases the already migrated pages.
> 
> As for the 2 << order gap. I can imagine that e.g. order-5 compaction (32
> pages) isolates 20 pages for migration and starts looking for free pages. It
> collects 19 free pages and then reaches an order-4 free page. Splitting that
> page to collect it would result in 19+16=35 pages isolated, thus exceed the
> 1 << order gap, and fail. With 2 << order gap, chances of this happening are
> reduced.

OK, fair enough but that sounds like a case which is not worth optimize
and introduce a subtle code for.

[...]

> > > - __isolate_free_page uses low watermark check to decide if free page can be
> > >    isolated. It also doesn't use ALLOC_CMA, so add it for the same reasons.
> > 
> > Why do we check the watermark at all? What would happen if this obscure
> > if (!is_migrate_isolate(mt)) was gone? I remember I put some tracing
> > there and it never hit for me even when I was testing close to OOM
> > conditions. Maybe an earlier check bailed out but this code path looks
> > really obscure so it should either deserve a large fat comment or to
> > die.
> 
> The check is there so that compaction doesn't exhaust memory below reserves
> during its work, just like any other non-privileged allocation.

Hmm. OK this is a fair point. I would expect that the reclaim preceeding
the compaction would compensate for the temporarily used memory but it
is true that a) we might be in the optimistic async compaction which
happens _before_ the reclaim and b) the reclaim might be not effective
enough so some throttling is indeed appropriate.

I guess you do not want to rely on throttling only at the beginning of
the compaction because it would be too racy, which would be true. So I
guess it would be indeed safer to check for the watermark both when we
attempt to compact and when we isolate free pages. Can we at least use a
common helper so that we know that those checks are done same way?
 
 Thanks!
-- 
Michal Hocko
SUSE Labs