Message-ID: <643d3680-9994-ce58-037f-b1fc123ff8bd@suse.cz>
Date: Wed, 14 Aug 2019 09:42:07 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: David Rientjes <rientjes@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Mel Gorman <mgorman@...hsingularity.net>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory

On 8/13/19 7:22 PM, David Rientjes wrote:
> On Tue, 13 Aug 2019, Vlastimil Babka wrote:
>
>> > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"),
>> > struct page of reserved memory is zeroed. This causes page->flags to be 0
>> > and fixes issues related to reading /proc/kpageflags, for example, of
>> > reserved memory.
>> >
>> > The VM_BUG_ON() in move_freepages_block(), however, assumes that
>> > page_zone() is meaningful even for reserved memory. That assumption is no
>> > longer true after the aforementioned commit.
>>
>> How come move_freepages_block() gets called on reserved memory in
>> the first place?
>>
>
> It's simply math after finding a valid free page from the per-zone free
> area to use as fallback. We find the beginning and end of the pageblock
> of the valid page and that can bring us into memory that was reserved per
> the e820. pfn_valid() is still true (it's backed by a struct page), but
> since it's zeroed, we shouldn't infer anything here from comparing
> its node or zone. The current node check just happens to succeed most of
> the time by luck because reserved memory typically appears on node 0.
>
> The fix here is to validate that we actually have buddy pages before
> testing if there's any type of zone or node strangeness going on.

I see, thanks.

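To make the quoted pfn arithmetic concrete, here is a minimal userspace sketch, not kernel code. It assumes 512 pages per pageblock (order 9, a common x86-64 configuration); the helper names and the pfn_range type are hypothetical.

```c
#include <stdbool.h>

/* Assumption: order-9 pageblocks, i.e. 512 pages per block. */
#define PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical inclusive pfn range, e.g. a region reserved by firmware. */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* Round a pfn down to the first pfn of its pageblock. */
unsigned long pageblock_start(unsigned long pfn)
{
	return pfn & ~(PAGEBLOCK_NR_PAGES - 1);
}

/* Last pfn of the pageblock containing pfn. */
unsigned long pageblock_end(unsigned long pfn)
{
	return pageblock_start(pfn) + PAGEBLOCK_NR_PAGES - 1;
}

/* True if the pageblock containing pfn shares any pfn with range r. */
bool pageblock_overlaps(unsigned long pfn, struct pfn_range r)
{
	return pageblock_start(pfn) <= r.end && r.start <= pageblock_end(pfn);
}
```

With a hypothetical reserved range of pfns 0x9f to 0xff, a free page at pfn 0x150 sits in the pageblock covering pfns 0 to 511, so walking that block from its start reaches reserved pages even though the free page itself is valid.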
>> > @@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *zone,
>> >  			continue;
>> >  		}
>> >  
>> > +		/* Make sure we are not inadvertently changing nodes */
>> > +		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
>> > +		VM_BUG_ON_PAGE(page_zone(page) != zone, page);
>>
>> The latter check implies the former check, so if it's to stay, the first
>> one could be removed and the comment adjusted s/nodes/zones/
>>
>
> Does it? The first is checking for a corrupted page_to_nid; the second is
> checking for a corrupted or unexpected page_zone. What's being tested
> here is the state of struct page, as it was prior to this patch, not
> the state of struct zone.

page_zone() calls page_to_nid() internally, so if nid was wrong, the resulting
zone pointer would also be wrong. But if you want more fine-grained bug output,
that's fine.
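
A simplified userspace model of that dependency, as a sketch only: the struct layouts and the model_* names below are hypothetical stand-ins, and the real kernel packs the node and zone index into page->flags rather than storing them as separate fields.

```c
#define MODEL_MAX_NODES 2
#define MODEL_MAX_ZONES 3

/* Hypothetical stand-ins for the kernel structures. */
struct zone {
	int unused;
};

struct node {
	struct zone zones[MODEL_MAX_ZONES];
};

struct page {
	int nid;	/* node id */
	int zone_idx;	/* zone index within the node */
};

struct node nodes[MODEL_MAX_NODES];

int model_page_to_nid(const struct page *page)
{
	return page->nid;
}

/*
 * The zone is looked up through the node id, so a wrong nid
 * necessarily yields a different zone pointer.
 */
struct zone *model_page_zone(const struct page *page)
{
	return &nodes[model_page_to_nid(page)].zones[page->zone_idx];
}
```

In this model a zeroed struct page resolves to node 0, zone index 0, so a zone comparison against it can only pass when the zone being scanned happens to be that one.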