linux-kernel - Re: [PATCH v1] mm/khugepaged: replace page_mapcount() check by folio_likely_mapped

Message-ID: <18b9acc9-9dc8-4857-83d1-952c94b69e01@nvidia.com>
Date: Wed, 24 Apr 2024 22:40:21 -0700
From: John Hubbard <jhubbard@...dia.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: David Hildenbrand <david@...hat.com>, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, linux-doc@...r.kernel.org,
 Andrew Morton <akpm@...ux-foundation.org>, Jonathan Corbet <corbet@....net>,
 "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
 Zi Yan <ziy@...dia.com>, Yang Shi <yang.shi@...ux.alibaba.com>,
 Ryan Roberts <ryan.roberts@....com>
Subject: Re: [PATCH v1] mm/khugepaged: replace page_mapcount() check by
 folio_likely_mapped_shared()

On 4/24/24 9:17 PM, Matthew Wilcox wrote:
> On Wed, Apr 24, 2024 at 09:00:50PM -0700, John Hubbard wrote:
>>> We want to limit the use of page_mapcount() to places where absolutely
>>> required, to prepare for kernel configs where we won't keep track of
>>> per-page mapcounts in large folios.
>>
>>
>> Just curious, can you elaborate on the motivation? I probably missed
>> the discussions that explained why page_mapcount() in large folios
>> is not desirable. Are we getting rid of a field in struct page/folio?
>> Some other reason?
> 
> Two reasons.  One is that, regardless of anything else, folio_mapcount()
> is expensive on large folios as it has to walk every page in the folio
> summing the mapcounts.  The more important reason is that when we move
> to separately allocated folios, we don't want to allocate an array of
> mapcounts in order to maintain a per-page mapcount.
> 
> So we're looking for a more compact scheme to avoid maintaining a
> per-page mapcount.
>

I see. Thanks for explaining the story.
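
To make the cost argument above concrete, here is a minimal userspace sketch in C. It is not kernel code: struct toy_folio, the toy_* helpers, and the single total_mapcount field are all hypothetical names invented for illustration, and the "compact scheme" shown is only one assumed possibility, since the thread does not say what the eventual scheme will look like.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_PAGES 512	/* assumed size: one 2 MiB folio of 4 KiB pages */

/* Hypothetical toy model; this is NOT the kernel's struct folio. */
struct toy_folio {
	size_t nr_pages;
	int page_mapcount[TOY_NR_PAGES];	/* per-page scheme: one counter per page */
	int total_mapcount;			/* assumed compact scheme: one counter per folio */
};

/* Record one new mapping of page idx, updating both schemes. */
static void toy_map_page(struct toy_folio *f, size_t idx)
{
	assert(idx < f->nr_pages);
	f->page_mapcount[idx]++;
	f->total_mapcount++;
}

/* Per-page scheme: the total is found by walking every page, O(nr_pages). */
static int toy_mapcount_by_walk(const struct toy_folio *f)
{
	int sum = 0;

	for (size_t i = 0; i < f->nr_pages; i++)
		sum += f->page_mapcount[i];
	return sum;
}

/* Compact scheme: the total is a single load, O(1), and needs no array. */
static int toy_mapcount_compact(const struct toy_folio *f)
{
	return f->total_mapcount;
}
```

In this toy model both helpers return the same total, but only the walking variant needs the per-page array, which is the allocation the quoted text wants to avoid for separately allocated folios.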

>>> The khugepaged MM selftests keep working as expected, including:
>>>
>>> 	Run test: collapse_max_ptes_shared (khugepaged:anon)
>>> 	Allocate huge page... OK
>>> 	Share huge page over fork()... OK
>>> 	Trigger CoW on page 255 of 512... OK
>>> 	Maybe collapse with max_ptes_shared exceeded.... OK
>>> 	Trigger CoW on page 256 of 512... OK
>>> 	Collapse with max_ptes_shared PTEs shared.... OK
>>> 	Check if parent still has huge page... OK
>>
>> Well, a word of caution! These tests do not (yet) cover either of
>> the interesting new cases that folio_likely_mapped_shared() presents:
>> KSM or hugetlbfs interactions. In other words, false positives.
> 
> Hmm ... KSM never uses large folios and hugetlbfs is disjoint from
> khugepaged?
> 

Oh good. I thought we might have had a testing hole, but no.



thanks,
-- 
John Hubbard
NVIDIA