Message-ID: <5f01bde6-fe31-9b0e-f288-06b82598a8b3@redhat.com>
Date: Wed, 25 Nov 2020 12:04:15 +0100
From: David Hildenbrand <david@...hat.com>
To: Mel Gorman <mgorman@...e.de>
Cc: Andrea Arcangeli <aarcange@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
Qian Cai <cai@....pw>, Michal Hocko <mhocko@...nel.org>,
linux-kernel@...r.kernel.org, Mike Rapoport <rppt@...ux.ibm.com>,
Baoquan He <bhe@...hat.com>
Subject: Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set
pageblock_skip on reserved pages
On 25.11.20 11:39, Mel Gorman wrote:
> On Wed, Nov 25, 2020 at 07:45:30AM +0100, David Hildenbrand wrote:
>>> Something must have changed more recently than v5.1 that caused the
>>> zoneid of reserved pages to be wrong, a possible candidate for the
>>> real would be this change below:
>>>
>>> + __init_single_page(pfn_to_page(pfn), pfn, 0, 0);
>>>
>>
>> Before that change, the memmap of memory holes was only zeroed out. So the zone/nid was 0, however, pages were not reserved and had a refcount of zero - resulting in other issues.
>>
>> Most pfn walkers shouldn't mess with reserved pages and simply skip them. That would be the right fix here.
>>
>
> Ordinarily yes, pfn walkers should not care about reserved pages but it's
> still surprising that the node/zone linkages would be wrong for memory
> holes. If they are in the middle of a zone, it means that a hole with
> valid struct pages could be mistaken for overlapping nodes (if the hole
> was in node 1 for example) or overlapping zones which is just broken.
I agree within zones - but AFAIU, the issue is reserved memory between
zones, right?

Assume your end of memory falls within a section - what would be the
right node/zone for such a memory hole at the end of the section? With
memory hotplug after such a hole, we can easily have multiple
nodes/zones spanning such a hole, unknown before hotplug.

IMHO, marking memory holes properly (as discussed) would be the cleanest
approach. For now, we use node/zone 0 + PageReserved - because memory
hotunplug (zone shrinking etc.) doesn't really care about ZONE_DMA.
>
>>>
>>> Whenever pfn_valid is true, it's better that the zoneid/nid is correct
>>> all times, otherwise if the second stage fails we end up in a bug with
>>> weird side effects.
>>
>> Memory holes with a valid memmap might not have a zone/nid. For now, skipping reserved pages should be good enough, no?
>>
>
> It would partially paper over the issue of setting the pageblock type
> based on a reserved page. I agree that compaction should not be returning
> pfns that are outside of the zone range because that is buggy in itself
> but valid struct pages should have valid information. I don't think we
> want to paper over that with unnecessary PageReserved checks.
Agreed, as long as we can handle that issue using range checks.
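
To illustrate what handling this with range checks could look like, here
is a minimal userspace sketch. It is not kernel code: struct model_page,
struct model_zone, clamp_to_zone() and walk_range() are hypothetical
stand-ins, and the memmap is modelled as a plain array indexed by pfn.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT the kernel's structures. */
struct model_page {
	bool reserved;	/* stands in for PageReserved() */
	int zone_id;	/* stands in for the zone linkage; holes carry 0 */
};

struct model_zone {
	int id;
	unsigned long start_pfn;	/* first pfn in the zone (inclusive) */
	unsigned long end_pfn;		/* one past the last pfn (exclusive) */
};

/*
 * Clamp the walk range [*start, *end) to the zone span.
 * Returns false if no pfn of the range lies inside the zone.
 */
static bool clamp_to_zone(const struct model_zone *zone,
			  unsigned long *start, unsigned long *end)
{
	assert(*start <= *end);
	if (*start < zone->start_pfn)
		*start = zone->start_pfn;
	if (*end > zone->end_pfn)
		*end = zone->end_pfn;
	return *start < *end;
}

/*
 * Walk one range and count the pages the walker visits. Pages outside
 * the zone span are never looked at, so their zone_id and reserved
 * bits cannot influence the result.
 */
static unsigned long walk_range(const struct model_page *memmap,
				unsigned long nr_pages,
				const struct model_zone *zone,
				unsigned long start, unsigned long end)
{
	unsigned long pfn, visited = 0;

	/* Never index past the modelled memmap. */
	if (end > nr_pages)
		end = nr_pages;
	if (start > end)
		start = end;
	if (!clamp_to_zone(zone, &start, &end))
		return 0;

	for (pfn = start; pfn < end; pfn++) {
		(void)memmap[pfn];	/* a real walker would inspect the page here */
		visited++;
	}
	return visited;
}
```

With a zone spanning pfns 4..11 and a walk over [0, 8), only pfns 4..7
are visited in this model; the hole pages at 0..3 (zone 0 + reserved)
are skipped without any PageReserved check.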
--
Thanks,
David / dhildenb