linux-kernel - Re: [RFC 04/13] mm, page_alloc: restructure direct compaction handling in slowpath

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <57358C0A.4020002@suse.cz>
Date:	Fri, 13 May 2016 10:10:50 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Rik van Riel <riel@...hat.com>,
	David Rientjes <rientjes@...gle.com>,
	Mel Gorman <mgorman@...hsingularity.net>,
	Johannes Weiner <hannes@...xchg.org>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Hugh Dickins <hughd@...gle.com>
Subject: Re: [RFC 04/13] mm, page_alloc: restructure direct compaction
 handling in slowpath

On 05/12/2016 03:29 PM, Michal Hocko wrote:
> On Tue 10-05-16 09:35:54, Vlastimil Babka wrote:
>> This patch attempts to restructure the code with only minimal functional
>> changes. The call to the first compaction and THP-specific checks are now
>> placed above the retry loop, and the "noretry" direct compaction is removed.
>>
>> The initial compaction is additionally restricted only to costly orders, as we
>> can expect smaller orders to be held back by watermarks, and only larger orders
>> to suffer primarily from fragmentation. This better matches the checks in
>> reclaim's shrink_zones().
>>
>> There are two other smaller functional changes. One is that the upgrade from
>> async migration to light sync migration will always occur after the initial
>> compaction.
>
> I do not think this belongs to the patch. There are two reasons. First
> we do not need to do potentially more expensive sync mode when async is
> able to make some progress and the second

My concern was that __GFP_NORETRY non-costly allocations wouldn't 
otherwise get a MIGRATE_SYNC_LIGHT pass at all. Previously they would 
get it in the noretry: label. So do you think it's a corner case not 
worth caring about? Alternatively we could also remove the 'restriction 
of initial async compaction to costly orders' from this patch and apply 
it separately later. That would also result in the flip to sync_light 
after the initial async attempt for these allocations.

> is that with the currently
> fragile compaction implementation this might reintroduce the premature
> OOM for order-2 requests reported by Hugh. Please see
> http://lkml.kernel.org/r/alpine.LSU.2.11.1604141114290.1086@eggly.anvils

Hmm IIRC that involved some wrong conflict resolution in mmotm? I don't 
remember what the code exactly did look like, but wasn't the problem 
that the initial compaction was async, then the left-over hunk changed 
migration_mode to sync_light, and then should_compact_retry() thought 
"oh we already failed sync_light, return false" when in fact the 
sync_light compaction never happened? Otherwise I don't see how 
switching to sync_light "too early" could lead to premature OOMs.

> Your later patch (which I haven't reviewed yet) is then changing this
> considerably

Yes, my other concern with should_compact_retry() after your "mm, oom: 
protect !costly allocations some more" is that relying on 
compaction_failed() to upgrade the migration mode is unreliable. Async 
compaction can easily keep returning as contended, so might never see 
the COMPACT_COMPLETE result, if it's e.g. limited to nodes without a 
really small zone such as ZONE_DMA.

> but I think it would be safer to not touch the migration
> mode in this - mostly cleanup - patch.
>
>> This is how it has been until recent patch "mm, oom: protect
>> !costly allocations some more", which introduced upgrading the mode based on
>> COMPACT_COMPLETE result, but kept the final compaction always upgraded, which
>> made it even more special. It's better to return to the simpler handling for
>> now, as migration modes will be further modified later in the series.
>>
>> The second change is that once both reclaim and compaction declare it's not
>> worth to retry the reclaim/compact loop, there is no final compaction attempt.
>> As argued above, this is intentional. If that final compaction were to succeed,
>> it would be due to a wrong retry decision, or simply a race with somebody else
>> freeing memory for us.
>>
>> The main outcome of this patch should be simpler code. Logically, the initial
>> compaction without reclaim is the exceptional case to the reclaim/compaction
>> scheme, but prior to the patch, it was the last loop iteration that was
>> exceptional. Now the code matches the logic better. The change also enable the
>> following patches.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
>
> Other than the above thing I like this patch.
> Acked-by: Michal Hocko <mhocko@...e.com>

Thanks!