Message-ID: <17a70bb4-5f61-d462-d722-cef8e1010351@suse.cz>
Date:   Thu, 18 Feb 2021 18:42:45 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-kernel@...r.kernel.org,
        Andrea Arcangeli <aarcange@...hat.com>,
        David Hildenbrand <david@...hat.com>,
        Michal Hocko <mhocko@...nel.org>,
        Mike Rapoport <rppt@...nel.org>, stable@...r.kernel.org,
        Qian Cai <cai@....pw>, David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH] mm, compaction: make fast_isolate_freepages() stay within
 zone

On 2/17/21 6:33 PM, Vlastimil Babka wrote:
> Compaction always operates on pages from a single given zone when isolating
> both pages to migrate and freepages. Pageblock boundaries are intersected with
> zone boundaries to be safe in case zone starts or ends in the middle of
> pageblock. The use of pageblock_pfn_to_page() protects against non-contiguous
> pageblocks.
> 
> The functions fast_isolate_freepages() and fast_isolate_around() don't
> currently protect the fast freepage isolation thoroughly enough against these
> corner cases, and can result in freepage isolation operating outside of zone
> boundaries:
> 
> - in fast_isolate_freepages() if we get a pfn from the first pageblock of a
>   zone that starts in the middle of that pageblock, 'highest' can be a pfn
>   outside of the zone. If we fail to isolate anything in this function, we
>   may then call fast_isolate_around() on a pfn outside of the zone and there
>   effectively do a set_pageblock_skip(page_to_pfn(highest)) which may currently
>   hit a VM_BUG_ON() in some configurations
> - fast_isolate_around() checks only the zone end boundary and not beginning,
>   nor that the pageblock is contiguous (with pageblock_pfn_to_page()) so it's
>   possible that we end up calling isolate_freepages_block() on a range of pfn's
>   from two different zones and end up e.g. isolating freepages under the wrong
>   zone's lock.
> 
> This patch should fix the above issues.

Sorry, totally forgot these:

Reported-by: Qian Cai <cai@....pw>
Reported-by: Andrea Arcangeli <aarcange@...hat.com>

> Fixes: 5a811889de10 ("mm, compaction: use free lists to quickly locate a migration target")
> Cc: <stable@...r.kernel.org>
> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>

Also thanks David and Mel for the acks!

Thanks to Mike I was able to boot v5.11 in qemu with memmap containing a type 20
hole as Andrea reported, but can't reproduce the bug so far (i.e. without this
patch, with DEBUG_VM enabled) using transhuge-stress; might need some more
nuanced workload...
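
For readers following along, the zone-boundary intersection described in the
quoted changelog can be sketched in plain userspace C. This is only an
illustrative model under stated assumptions, not the kernel code or the patch
itself: struct zone_model, struct pfn_range, PAGEBLOCK_NR_PAGES (fixed at 512
here) and both helper functions are hypothetical names made up for this
example, and the contiguity check that pageblock_pfn_to_page() provides is not
modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pageblock size in pages; the real value is configuration dependent. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical stand-in for the pfn span of a zone (not struct zone). */
struct zone_model {
	unsigned long start_pfn;	/* first pfn in the zone */
	unsigned long end_pfn;		/* one past the last pfn in the zone */
};

/* Half-open pfn range [start, end). */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* True if pfn lies within the zone's span. */
bool pfn_in_zone(const struct zone_model *z, unsigned long pfn)
{
	return pfn >= z->start_pfn && pfn < z->end_pfn;
}

/*
 * Compute the part of pfn's pageblock that lies inside the zone, clamping
 * both the start and the end of the pageblock to the zone boundaries.
 * Returns false (leaving *out untouched) if pfn itself is outside the zone.
 */
bool pageblock_range_in_zone(const struct zone_model *z, unsigned long pfn,
			     struct pfn_range *out)
{
	unsigned long block_start, block_end;

	assert(z->start_pfn < z->end_pfn);

	if (!pfn_in_zone(z, pfn))
		return false;

	block_start = pfn - (pfn % PAGEBLOCK_NR_PAGES);
	block_end = block_start + PAGEBLOCK_NR_PAGES;

	/* A zone may start or end in the middle of a pageblock. */
	out->start = block_start < z->start_pfn ? z->start_pfn : block_start;
	out->end = block_end > z->end_pfn ? z->end_pfn : block_end;
	return true;
}
```

With a model zone spanning pfns [768, 2048), a pfn of 800 sits in the pageblock
[512, 1024), and the clamped range becomes [768, 1024), so nothing below the
zone start is ever scanned. Checking only the end boundary, as the second
bullet of the changelog describes, would leave the range starting at 512.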
