Message-ID: <17a70bb4-5f61-d462-d722-cef8e1010351@suse.cz>
Date: Thu, 18 Feb 2021 18:42:45 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org,
Andrea Arcangeli <aarcange@...hat.com>,
David Hildenbrand <david@...hat.com>,
Michal Hocko <mhocko@...nel.org>,
Mike Rapoport <rppt@...nel.org>, stable@...r.kernel.org,
Qian Cai <cai@....pw>, David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH] mm, compaction: make fast_isolate_freepages() stay within zone
On 2/17/21 6:33 PM, Vlastimil Babka wrote:
> Compaction always operates on pages from a single given zone when isolating
> both pages to migrate and freepages. Pageblock boundaries are intersected with
> zone boundaries to be safe in case zone starts or ends in the middle of
> pageblock. The use of pageblock_pfn_to_page() protects against non-contiguous
> pageblocks.
>
> The functions fast_isolate_freepages() and fast_isolate_around() don't
> currently protect the fast freepage isolation thoroughly enough against these
> corner cases, and can result in freepage isolation operating outside of zone
> boundaries:
>
> - in fast_isolate_freepages(), if we get a pfn from the first pageblock of a
>   zone that starts in the middle of that pageblock, 'highest' can be a pfn
>   outside of the zone. If we fail to isolate anything in this function, we
>   may then call fast_isolate_around() on a pfn outside of the zone and there
>   effectively do a set_pageblock_skip(page_to_pfn(highest)), which may
>   currently hit a VM_BUG_ON() in some configurations
> - fast_isolate_around() checks only the zone end boundary and not the
>   beginning, nor that the pageblock is contiguous (with
>   pageblock_pfn_to_page()), so it's possible that we end up calling
>   isolate_freepages_block() on a range of pfns from two different zones and
>   end up e.g. isolating freepages under the wrong zone's lock.
>
> This patch should fix the above issues.
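
To illustrate the boundary intersection described above, here is a minimal
user-space sketch in C. It is not the patch and not kernel code: struct
zone_model, MODEL_PAGEBLOCK_NR_PAGES (assumed to be 512) and
clamp_block_to_zone() are hypothetical names made up for this example. It
only models clamping a pageblock to both zone boundaries; the contiguity
check done by pageblock_pfn_to_page() is not modeled.

```c
#include <assert.h>

/* Hypothetical user-space model for illustration; not kernel code. */
#define MODEL_PAGEBLOCK_NR_PAGES 512UL /* assumed pageblock size, power of two */

struct zone_model {
	unsigned long start_pfn; /* first pfn belonging to the zone */
	unsigned long end_pfn;   /* one past the last pfn of the zone */
};

/* First pfn of the pageblock that contains pfn. */
static unsigned long model_block_start(unsigned long pfn)
{
	return pfn & ~(MODEL_PAGEBLOCK_NR_PAGES - 1);
}

/* One past the last pfn of the pageblock that contains pfn. */
static unsigned long model_block_end(unsigned long pfn)
{
	return model_block_start(pfn) + MODEL_PAGEBLOCK_NR_PAGES;
}

/*
 * Intersect the pageblock containing pfn with the zone, checking both the
 * start and the end boundary. Returns 1 and fills [*start, *end) on
 * success, or 0 if pfn itself lies outside the zone.
 */
static int clamp_block_to_zone(const struct zone_model *z, unsigned long pfn,
			       unsigned long *start, unsigned long *end)
{
	unsigned long bs = model_block_start(pfn);
	unsigned long be = model_block_end(pfn);

	if (pfn < z->start_pfn || pfn >= z->end_pfn)
		return 0;

	*start = bs > z->start_pfn ? bs : z->start_pfn;
	*end = be < z->end_pfn ? be : z->end_pfn;

	/* The clamped range still contains the pfn we started from. */
	assert(*start <= pfn && pfn < *end);
	return 1;
}
```

With a zone covering pfns [1000, 5000), pfn 1010 sits in a pageblock that
starts at 512, before the zone does; the sketch clamps the range to
[1000, 1024) instead of handing out a start pfn below the zone.
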
Sorry, totally forgot these:
Reported-by: Qian Cai <cai@....pw>
Reported-by: Andrea Arcangeli <aarcange@...hat.com>
> Fixes: 5a811889de10 ("mm, compaction: use free lists to quickly locate a migration target")
> Cc: <stable@...r.kernel.org>
> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
Also thanks David and Mel for the acks!
Thanks to Mike I was able to boot v5.11 in qemu with memmap containing a type 20
hole as Andrea reported, but can't reproduce the bug so far (i.e. without this
patch, with DEBUG_VM enabled) using transhuge-stress; might need some more
nuanced workload...