Message-ID: <5c7aaf9b-a6c0-4670-a244-67948ca86727@linux.alibaba.com>
Date: Thu, 30 Oct 2025 09:15:55 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Nico Pache <npache@...hat.com>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: David Hildenbrand <david@...hat.com>, linux-kernel@...r.kernel.org,
 linux-trace-kernel@...r.kernel.org, linux-mm@...ck.org,
 linux-doc@...r.kernel.org, ziy@...dia.com, Liam.Howlett@...cle.com,
 ryan.roberts@....com, dev.jain@....com, corbet@....net, rostedt@...dmis.org,
 mhiramat@...nel.org, mathieu.desnoyers@...icios.com,
 akpm@...ux-foundation.org, baohua@...nel.org, willy@...radead.org,
 peterx@...hat.com, wangkefeng.wang@...wei.com, usamaarif642@...il.com,
 sunnanyong@...wei.com, vishal.moola@...il.com,
 thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
 kas@...nel.org, aarcange@...hat.com, raquini@...hat.com,
 anshuman.khandual@....com, catalin.marinas@....com, tiwai@...e.de,
 will@...nel.org, dave.hansen@...ux.intel.com, jack@...e.cz, cl@...two.org,
 jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com,
 hannes@...xchg.org, rientjes@...gle.com, mhocko@...e.com,
 rdunlap@...radead.org, hughd@...gle.com, richard.weiyang@...il.com,
 lance.yang@...ux.dev, vbabka@...e.cz, rppt@...nel.org, jannh@...gle.com,
 pfalcato@...e.de
Subject: Re: [PATCH v12 mm-new 06/15] khugepaged: introduce
 collapse_max_ptes_none helper function
On 2025/10/30 05:14, Nico Pache wrote:
> On Wed, Oct 29, 2025 at 12:56 PM Lorenzo Stoakes
> <lorenzo.stoakes@...cle.com> wrote:
>>
>> On Wed, Oct 29, 2025 at 10:09:43AM +0800, Baolin Wang wrote:
>>> I finally finished reading through the discussions across multiple
>>> threads:), and it looks like we've reached a preliminary consensus (make
>>> 0/511 work). Great and thanks!
>>
>> Yes we're getting there :) it's a sincere effort to try to find a way to move
>> forwards.
>>
>>>
>>> IIUC, the strategy is, configuring it to 511 means always enabling mTHP
>>> collapse, configuring it to 0 means collapsing mTHP only if all PTEs are
>>> non-none/zero, and for other values, we issue a warning and prohibit mTHP
>>> collapse (avoid Lorenzo's concern about silently changing max_ptes_none).
>>> Then the implementation for collapse_max_ptes_none() should be as follows:
>>>
>>> static int collapse_max_ptes_none(unsigned int order, bool full_scan)
>>> {
>>>          /* ignore max_ptes_none limits */
>>>          if (full_scan)
>>>                  return HPAGE_PMD_NR - 1;
>>>
>>>          if (order == HPAGE_PMD_ORDER)
>>>                  return khugepaged_max_ptes_none;
>>>
>>>          /*
>>>           * To prevent creeping towards larger order collapses for mTHP
>>>           * collapse, we restrict khugepaged_max_ptes_none to only 511 or 0,
>>>           * simplifying the logic. This means:
>>>           * max_ptes_none == 511 -> collapse mTHP always
>>>           * max_ptes_none == 0 -> collapse mTHP only if all PTEs are
>>>           * non-none/zero
>>>           */
>>>          if (!khugepaged_max_ptes_none ||
>>>              khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
>>>                  return khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - order);
>>>
>>>          pr_warn_once("mTHP collapse only supports khugepaged_max_ptes_none configured as 0 or %d\n",
>>>                       HPAGE_PMD_NR - 1);
>>>          return -EINVAL;
>>> }
>>>
>>> So what do you think?
>>
>> Yeah I think something like this.
>>
>> Though I'd implement it more explicitly like:
>>
>>          /* Zero/non-present collapse disabled. */
>>          if (!khugepaged_max_ptes_none)
>>                  return 0;
>>
>>          /* Collapse the maximum number of zero/non-present PTEs. */
>>          if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
>>                  return (1 << order) - 1;
>>
>> Then we can do away with this confusing (HPAGE_PMD_ORDER - order) stuff.
> 
> This looks cleaner/more explicit given the limits we are enforcing!
> 
> I'll go for something like that.
> 
>>
>> A quick check in Google Sheets suggests my maths is OK here, but do correct me
>> if I'm wrong :)
> 
> LGTM!
LGTM. Thanks.
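
For reference, here is a rough userspace sketch of how the helper might look
once Lorenzo's explicit checks are folded in. It is only an illustration, not
the patch itself: HPAGE_PMD_ORDER is assumed to be 9 (x86-64 with 4K pages),
the tunable is a plain variable, pr_warn_once() is left out because it does
not exist outside the kernel, and forms_agree() is a hypothetical helper that
checks that (1 << order) - 1 matches the earlier shift-based form for 511.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed stand-ins for the kernel macros (x86-64, 4K base pages). */
#define HPAGE_PMD_ORDER 9
#define HPAGE_PMD_NR    (1 << HPAGE_PMD_ORDER)

/* Stand-in for the khugepaged max_ptes_none tunable. */
static unsigned int khugepaged_max_ptes_none = HPAGE_PMD_NR - 1;

static int collapse_max_ptes_none(unsigned int order, bool full_scan)
{
	/* Ignore max_ptes_none limits. */
	if (full_scan)
		return HPAGE_PMD_NR - 1;

	/* PMD-sized collapse keeps using the tunable unchanged. */
	if (order == HPAGE_PMD_ORDER)
		return khugepaged_max_ptes_none;

	/* Zero/non-present collapse disabled. */
	if (!khugepaged_max_ptes_none)
		return 0;

	/* Collapse the maximum number of zero/non-present PTEs. */
	if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
		return (1 << order) - 1;

	/* Any other value is rejected for mTHP (the kernel would also warn). */
	return -EINVAL;
}

/* Hypothetical check: explicit form equals the shift form for every order. */
static bool forms_agree(void)
{
	for (unsigned int order = 0; order <= HPAGE_PMD_ORDER; order++) {
		int shifted = (HPAGE_PMD_NR - 1) >> (HPAGE_PMD_ORDER - order);

		if (shifted != (1 << order) - 1)
			return false;
	}
	return true;
}
```

With these assumptions, an order-4 mTHP collapse allows 15 none/zero PTEs when
the tunable is 511, allows none when it is 0, and gets -EINVAL for any value
in between.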
