Message-ID: <c8c5e818-536a-4d72-b8dc-36aeb1b61800@arm.com>
Date: Thu, 28 Aug 2025 16:18:48 +0530
From: Dev Jain <dev.jain@....com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>,
 David Hildenbrand <david@...hat.com>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Nico Pache <npache@...hat.com>, linux-mm@...ck.org,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-trace-kernel@...r.kernel.org, ziy@...dia.com, Liam.Howlett@...cle.com,
 ryan.roberts@....com, corbet@....net, rostedt@...dmis.org,
 mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
 akpm@...ux-foundation.org, baohua@...nel.org, willy@...radead.org,
 peterx@...hat.com, wangkefeng.wang@...wei.com, usamaarif642@...il.com,
 sunnanyong@...wei.com, vishal.moola@...il.com,
 thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
 kirill.shutemov@...ux.intel.com, aarcange@...hat.com, raquini@...hat.com,
 anshuman.khandual@....com, catalin.marinas@....com, tiwai@...e.de,
 will@...nel.org, dave.hansen@...ux.intel.com, jack@...e.cz, cl@...two.org,
 jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com,
 hannes@...xchg.org, rientjes@...gle.com, mhocko@...e.com,
 rdunlap@...radead.org, hughd@...gle.com
Subject: Re: [PATCH v10 00/13] khugepaged: mTHP support


On 28/08/25 3:16 pm, Baolin Wang wrote:
> (Sorry for chiming in late)
>
> On 2025/8/22 22:10, David Hildenbrand wrote:
>>>> One could also easily support the value 255 (HPAGE_PMD_NR / 2 - 1),
>>>> but not sure if we have to add that for now.
>>>
>>> Yeah not so sure about this, this is a 'just have to know' too, and yes
>>> you might add it to the docs, but people are going to be mightily
>>> confused, esp if it's a calculated value.
>>>
>>> I don't see any other way around having a separate tunable if we don't
>>> just have something VERY simple like on/off.
>>
>> Yeah, not advocating that we add support for other values than 0/511,
>> really.
>>
>>>
>>> Also the mentioned issue sounds like something that needs to be fixed
>>> elsewhere, honestly, in the algorithm used to figure out mTHP ranges (I
>>> may be wrong - and happy to stand corrected if this is somehow inherent,
>>> but it really feels that way).
>>
>> I think the creep is unavoidable for certain values.
>>
>> If you have the first two pages of a PMD area populated, and you allow
>> for at least half of the #PTEs to be none/zero, you'd collapse first an
>> order-2 folio, then an order-3 ... until you reached PMD order.
>>
>> So for now we really should just support 0 / 511 to say "don't collapse
>> if there are holes" vs. "always collapse if there is at least one pte
>> used".
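
To make the creep scenario above concrete, here is a minimal sketch. It is
not the khugepaged code: creep_final_order() and scaled_max_none() are
hypothetical helpers, and the per-order scaling (shifting max_ptes_none
right by HPAGE_PMD_ORDER - order) is an assumption made for illustration.
The model assumes the populated PTEs sit contiguously at the start of the
PMD range and that each successful collapse fills the whole region.

```c
#define HPAGE_PMD_ORDER 9
#define MIN_MTHP_ORDER  2	/* assumed smallest mTHP order considered */

/* Assumed scaling: shrink the PMD-level limit in proportion to the order. */
static int scaled_max_none(int max_ptes_none, int order)
{
	return max_ptes_none >> (HPAGE_PMD_ORDER - order);
}

/*
 * Model repeated scans over the region starting at offset 0 of a PMD range.
 * Returns the largest order reached, or 0 if no collapse happens at all.
 */
static int creep_final_order(int populated, int max_ptes_none)
{
	int present = populated;	/* populated PTEs at the start of the range */
	int reached = 0;

	for (int order = MIN_MTHP_ORDER; order <= HPAGE_PMD_ORDER; order++) {
		int nr = 1 << order;
		int in_region = present < nr ? present : nr;
		int none = nr - in_region;

		if (none > scaled_max_none(max_ptes_none, order))
			break;		/* too many holes: the creep stops here */

		present = nr;		/* a collapse populates the whole region */
		reached = order;
	}
	return reached;
}
```

Under these assumptions, creep_final_order(2, 256) walks all the way up to
order 9, which is the creep described above, while creep_final_order(3, 255)
stops at order 2 and creep_final_order(2, 0) never collapses.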
>
> If we only allow setting 0 or 511, as Nico mentioned before, "At 511, no
> mTHP collapses would ever occur anyway, unless you have 2MB disabled and
> other mTHP sizes enabled. Technically, at 511, only the highest enabled
> order would ever be collapsed."
I didn't understand this statement. At 511, mTHP collapses will occur if 
khugepaged cannot get a PMD folio. Our goal is to collapse to the 
highest order folio.
>
> In other words, for the scenario you described, although there are only
> 2 PTEs present in a PMD, it would still get collapsed into a PMD-sized
> THP. In reality, what we probably need is just an order-2 mTHP collapse.
>
> If 'khugepaged_max_ptes_none' is set to 255, I think this would achieve
> the desired result: when there are only 2 PTEs present in a PMD, an
> order-2 mTHP collapse would succeed, but it wouldn’t creep up to an
> order-3 mTHP collapse. That’s because, when attempting an order-3 mTHP
> collapse, 'threshold_bits' = 1 while 'bits_set' = 1 (meaning only 1 chunk
> is present), so 'bits_set > threshold_bits' is false, and an order-3 mTHP
> collapse wouldn’t be attempted. No?
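
The arithmetic in the paragraph above can be checked with a small sketch.
Only 'threshold_bits', 'bits_set', and the 'bits_set > threshold_bits'
comparison come from the quoted text. The formula in threshold_bits()
below, the order-2 chunk size, and the helper collapse_attempted() are
assumptions chosen so that the result matches the quoted numbers; they
should be checked against the actual patch.

```c
#include <stdbool.h>

#define HPAGE_PMD_ORDER 9
#define HPAGE_PMD_NR    (1 << HPAGE_PMD_ORDER)
#define MIN_MTHP_ORDER  2	/* assumed: one bitmap bit per order-2 chunk */

/*
 * Assumed formula: scale the number of PTEs that must be present down to
 * the number of chunks covered by a collapse of the given order.
 */
static int threshold_bits(int max_ptes_none, int order)
{
	int chunk_order = order - MIN_MTHP_ORDER;

	return (HPAGE_PMD_NR - max_ptes_none - 1) >>
	       (HPAGE_PMD_ORDER - chunk_order);
}

/* The comparison quoted above: attempt only if enough chunks are set. */
static bool collapse_attempted(int bits_set, int max_ptes_none, int order)
{
	return bits_set > threshold_bits(max_ptes_none, order);
}
```

With max_ptes_none = 255 and a single chunk present, threshold_bits() is 0
for order 2 and 1 for order 3 under this formula, so the order-2 attempt
goes ahead and the order-3 attempt does not.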
>
> So I have some concerns that if we only allow setting 0 or 511, it may not
> meet the goal we have for mTHP collapsing.
>
>>>> Because, as raised in the past, I'm afraid nobody on this earth has a
>>>> clue how to set this parameter to values different to 0 (don't waste
>>>> memory with khugepaged) and 511 (page fault behavior).
>>>
>>> Yup
>>>
>>>>
>>>>
>>>> If any other value is set, essentially
>>>>     pr_warn("Unsupported 'max_ptes_none' value for mTHP collapse");
>>>>
>>>> for now and just disable it.
>>>
>>> Hmm, but under what circumstances? I would just say "unsupported
>>> value" and not mention mTHP, or people who don't use mTHP might find
>>> that confusing.
>>
>> Well, we can check whether any mTHP size is enabled while the value is
>> set to something unexpected. We can then even print the problematic
>> sizes if we have to.
>>
>> We could also just say that if the value is set to something other than
>> 511 (which is the default), it will be treated as being "0" when
>> collapsing mTHP, instead of doing any scaling.
>>
>
