lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <570B9E50.9040000@suse.cz>
Date:	Mon, 11 Apr 2016 14:53:36 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Michal Hocko <mhocko@...nel.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Mel Gorman <mgorman@...e.de>,
	David Rientjes <rientjes@...gle.com>,
	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
	Joonsoo Kim <js1304@...il.com>,
	Hillf Danton <hillf.zj@...baba-inc.com>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 06/11] mm, compaction: distinguish between full and
 partial COMPACT_COMPLETE

On 04/11/2016 02:46 PM, Michal Hocko wrote:
>> This assumes that migrate scanner at initial position implies also free
>> scanner at the initial position. That should be true, because migration
>> scanner is the first to run. But getting the zone->compact_cached_*_pfn is
>> racy. Worse, zone->compact_cached_migrate_pfn is array distinguishing sync
>> and async compaction, so it's possible that async compaction has advanced
>> both its own migrate scanner cached position, and the shared free scanner
>> cached position, and then sync compaction starts migrate scanner at
>> start_pfn, but free scanner has already advanced.
>
> OK, I see. The whole thing smelled racy but I thought it wouldn't be
> such a big deal. Even if we raced then only a marginal part of the zone
> wouldn't be scanned, right? Or is it possible that free_pfn would appear
> in the middle of the zone because of the race?

The racy part is negligible but I didn't realize the sync/async migrate 
scanner part until now. So yeah, free_pfn could have got to middle of 
zone when it was in the async mode. But that also means that the async 
mode recently used up all free pages in the second half of the zone. WRT 
free pages isolation, async mode is not trying less than sync, so it 
shouldn't be a considerable missed opportunity if we don't rescan the 
it, though.

>> So you might still see a false positive COMPACT_COMPLETE, just less
>> frequently and probably with much lower impact.
>> But if you need to be truly reliable, check also that cc->free_pfn ==
>> round_down(end_pfn - 1, pageblock_nr_pages)
>
> I do not think we need the precise check if the race window (in the
> skipped zone range) is always small.
>
> Thanks!
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ