linux-kernel - Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set pageblock


Message-ID: <cef8f30b-8b11-77df-4a12-e35715332778@redhat.com>
Date:   Wed, 25 Nov 2020 14:41:10 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Mel Gorman <mgorman@...e.de>
Cc:     Andrea Arcangeli <aarcange@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        Qian Cai <cai@....pw>, Michal Hocko <mhocko@...nel.org>,
        linux-kernel@...r.kernel.org, Mike Rapoport <rppt@...ux.ibm.com>,
        Baoquan He <bhe@...hat.com>
Subject: Re: [PATCH 1/1] mm: compaction: avoid fast_isolate_around() to set
 pageblock_skip on reserved pages

On 25.11.20 14:33, Mel Gorman wrote:
> On Wed, Nov 25, 2020 at 12:04:15PM +0100, David Hildenbrand wrote:
>> On 25.11.20 11:39, Mel Gorman wrote:
>>> On Wed, Nov 25, 2020 at 07:45:30AM +0100, David Hildenbrand wrote:
>>>>> Something must have changed more recently than v5.1 that caused the
>>>>> zoneid of reserved pages to be wrong, a possible candidate for the
>>>>> real would be this change below:
>>>>>
>>>>> +               __init_single_page(pfn_to_page(pfn), pfn, 0, 0);
>>>>>
>>>>
>>>> Before that change, the memmap of memory holes was only zeroed out. So the zone/nid was 0; however, the pages were not reserved and had a refcount of zero - resulting in other issues.
>>>>
>>>> Most pfn walkers shouldn't mess with reserved pages and simply skip them. That would be the right fix here.
>>>>
>>>
>>> Ordinarily yes, pfn walkers should not care about reserved pages but it's
>>> still surprising that the node/zone linkages would be wrong for memory
>>> holes. If they are in the middle of a zone, it means that a hole with
>>> valid struct pages could be mistaken for overlapping nodes (if the hole
>>> was in node 1 for example) or overlapping zones which is just broken.
>>
>> I agree within zones - but AFAIU, the issue is reserved memory between
>> zones, right?
>>
> 
> It can also occur in the middle of the zone.
> 
>> Assume your end of memory falls within a section - what would be the
>> right node/zone for such a memory hole at the end of the section?
> 
> Assuming a hole is not MAX_ORDER-aligned but there is real memory within
> the page block, then the node/zone for the struct pages backing the hole
> should match the real memory's node and zone.
> 
> As it stands, with the uninitialised node/zone, certain checks like
> page_is_buddy(): page_zone_id(page) != page_zone_id(buddy) may only
> work by coincidence. page_is_buddy() happens to work anyway because
> PageBuddy(buddy) would never be true for a PageReserved page.
> 
>> With
>> memory hotplug after such a hole, we can easily have multiple
>> nodes/zones spanning such a hole, unknown before hotplug.
>>
> 
> When hotplugged, the same logic would apply. Where the hole is not aligned,
> the struct page linkages should match the "real" memory.
> 
>>> It would partially paper over the issue of setting the pageblock type
>>> based on a reserved page. I agree that compaction should not be returning
>>> pfns that are outside of the zone range because that is buggy in itself
>>> but valid struct pages should have valid information. I don't think we
>>> want to paper over that with unnecessary PageReserved checks.
>>
>> Agreed as long as we can handle that issue using range checks.
>>
> 
> I think it'll be ok as long as the struct pages within a 1<<(MAX_ORDER-1)
> range have proper linkages.

Agreed.
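
To make the condition above concrete, here is a minimal userspace sketch,
not kernel code. Every name in it (toy_page, TOY_MAX_ORDER, toy_fill(),
toy_pfn_in_zone(), toy_block_linkage_consistent()) is hypothetical and only
models the idea: struct pages backing a hole should carry the same node/zone
linkage as the real memory in the same 1<<(MAX_ORDER-1) aligned range, and a
pfn walker can then rely on a plain zone range check instead of extra
PageReserved tests.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy value for illustration only; the real MAX_ORDER is a kernel config matter. */
#define TOY_MAX_ORDER   4
#define TOY_BLOCK_PAGES (1UL << (TOY_MAX_ORDER - 1))

/* Hypothetical stand-in for the node/zone linkage kept in struct page. */
struct toy_page {
	int  nid;       /* node id */
	int  zid;       /* zone id */
	bool reserved;  /* models PageReserved */
};

/* Initialise the toy memmap entries for pfns in [from, to). */
static void toy_fill(struct toy_page *map, unsigned long from,
		     unsigned long to, int nid, int zid, bool reserved)
{
	for (unsigned long pfn = from; pfn < to; pfn++) {
		map[pfn].nid = nid;
		map[pfn].zid = zid;
		map[pfn].reserved = reserved;
	}
}

/* Range check: does pfn lie inside the zone span [zone_start, zone_end)? */
static bool toy_pfn_in_zone(unsigned long pfn, unsigned long zone_start,
			    unsigned long zone_end)
{
	return pfn >= zone_start && pfn < zone_end;
}

/*
 * Return true if every toy page in the aligned block containing pfn has
 * the same node/zone linkage, whether or not the page is reserved.
 */
static bool toy_block_linkage_consistent(const struct toy_page *map,
					 unsigned long npages,
					 unsigned long pfn)
{
	unsigned long start = pfn & ~(TOY_BLOCK_PAGES - 1);
	unsigned long end = start + TOY_BLOCK_PAGES;

	assert(pfn < npages);
	if (end > npages)
		end = npages;

	for (unsigned long i = start + 1; i < end; i++) {
		if (map[i].nid != map[start].nid ||
		    map[i].zid != map[start].zid)
			return false;
	}
	return true;
}
```

With 16 toy pages, filling pfns 0-7 with the same node/zone (the last two
marked reserved to model a hole) gives a consistent block, while leaving
pfns 12-15 zeroed next to real memory in pfns 8-11 gives an inconsistent
one, which is the situation described earlier in the thread.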


-- 
Thanks,

David / dhildenb