Message-ID: <cb2f5978-5fdf-4e4d-a662-14c5858b0fe6@linux.alibaba.com>
Date: Thu, 20 Nov 2025 14:37:58 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Nico Pache <npache@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 linux-mm@...ck.org, linux-doc@...r.kernel.org, david@...hat.com,
 ziy@...dia.com, Liam.Howlett@...cle.com, ryan.roberts@....com,
 dev.jain@....com, corbet@....net, rostedt@...dmis.org, mhiramat@...nel.org,
 mathieu.desnoyers@...icios.com, akpm@...ux-foundation.org,
 baohua@...nel.org, willy@...radead.org, peterx@...hat.com,
 wangkefeng.wang@...wei.com, usamaarif642@...il.com, sunnanyong@...wei.com,
 vishal.moola@...il.com, thomas.hellstrom@...ux.intel.com,
 yang@...amperecomputing.com, kas@...nel.org, aarcange@...hat.com,
 raquini@...hat.com, anshuman.khandual@....com, catalin.marinas@....com,
 tiwai@...e.de, will@...nel.org, dave.hansen@...ux.intel.com, jack@...e.cz,
 cl@...two.org, jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com,
 hannes@...xchg.org, rientjes@...gle.com, mhocko@...e.com,
 rdunlap@...radead.org, hughd@...gle.com, richard.weiyang@...il.com,
 lance.yang@...ux.dev, vbabka@...e.cz, rppt@...nel.org, jannh@...gle.com,
 pfalcato@...e.de
Subject: Re: [PATCH v12 mm-new 14/15] khugepaged: run khugepaged for all
 orders



On 2025/11/19 20:13, Lorenzo Stoakes wrote:
> On Wed, Oct 22, 2025 at 12:37:16PM -0600, Nico Pache wrote:
>> From: Baolin Wang <baolin.wang@...ux.alibaba.com>
>>
>> If any order (m)THP is enabled we should allow running khugepaged to
>> attempt scanning and collapsing mTHPs. In order for khugepaged to operate
>> when only mTHP sizes are specified in sysfs, we must modify the predicate
>> function that determines whether it ought to run to do so.
>>
>> This function is currently called hugepage_pmd_enabled(), this patch
>> renames it to hugepage_enabled() and updates the logic to check to
>> determine whether any valid orders may exist which would justify
>> khugepaged running.
>>
>> We must also update collapse_allowable_orders() to check all orders if
>> the vma is anonymous and the collapse is khugepaged.
>>
>> After this patch khugepaged mTHP collapse is fully enabled.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
>> Signed-off-by: Nico Pache <npache@...hat.com>
>> ---
>>   mm/khugepaged.c | 25 +++++++++++++------------
>>   1 file changed, 13 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 54f5c7888e46..8ed9f8e2d376 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -418,23 +418,23 @@ static inline int collapse_test_exit_or_disable(struct mm_struct *mm)
>>   		mm_flags_test(MMF_DISABLE_THP_COMPLETELY, mm);
>>   }
>>
>> -static bool hugepage_pmd_enabled(void)
>> +static bool hugepage_enabled(void)
>>   {
>>   	/*
>>   	 * We cover the anon, shmem and the file-backed case here; file-backed
>>   	 * hugepages, when configured in, are determined by the global control.
>> -	 * Anon pmd-sized hugepages are determined by the pmd-size control.
>> +	 * Anon hugepages are determined by its per-size mTHP control.
>>   	 * Shmem pmd-sized hugepages are also determined by its pmd-size control,
>>   	 * except when the global shmem_huge is set to SHMEM_HUGE_DENY.
>>   	 */
>>   	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
>>   	    hugepage_global_enabled())
>>   		return true;
>> -	if (test_bit(PMD_ORDER, &huge_anon_orders_always))
>> +	if (READ_ONCE(huge_anon_orders_always))
>>   		return true;
>> -	if (test_bit(PMD_ORDER, &huge_anon_orders_madvise))
>> +	if (READ_ONCE(huge_anon_orders_madvise))
>>   		return true;
>> -	if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) &&
>> +	if (READ_ONCE(huge_anon_orders_inherit) &&
>>   	    hugepage_global_enabled())
>>   		return true;
>>   	if (IS_ENABLED(CONFIG_SHMEM) && shmem_hpage_pmd_enabled())
>> @@ -508,7 +508,8 @@ static unsigned long collapse_allowable_orders(struct vm_area_struct *vma,
>>   			vm_flags_t vm_flags, bool is_khugepaged)
>>   {
>>   	enum tva_type tva_flags = is_khugepaged ? TVA_KHUGEPAGED : TVA_FORCED_COLLAPSE;
>> -	unsigned long orders = BIT(HPAGE_PMD_ORDER);
>> +	unsigned long orders = is_khugepaged && vma_is_anonymous(vma) ?
>> +				THP_ORDERS_ALL_ANON : BIT(HPAGE_PMD_ORDER);
> 
> Why are we doing this? If this is explicitly enabling mTHP for anon, which it
> seems to be, can we please make this a little more explicit :)
> 
> I'd prefer this not to be a horribly squashed ternary, rather:
> 
> 	unsigned long orders;
> 
> 	/* We explicitly allow mTHP collapse for anonymous khugepaged ONLY. */
> 	if (is_khugepaged && vma_is_anonymous(vma))
> 		orders = THP_ORDERS_ALL_ANON;
> 	else
> 		orders = BIT(HPAGE_PMD_ORDER);

Yes, LGTM.

> Also, does THP_ORDERS_ALL_ANON account for KHUGEPAGED_MIN_MTHP_ORDER? It's weird
> to say that an order is allowed that isn't permitted by mTHP (e.g. order-0).

THP_ORDERS_ALL_ANON already filters out order 0 and order 1, so it
matches the definition of KHUGEPAGED_MIN_MTHP_ORDER:

/*
 * Mask of all large folio orders supported for anonymous THP; all orders up to
 * and including PMD_ORDER, except order-0 (which is not "huge") and order-1
 * (which is a limitation of the THP implementation).
 */
#define THP_ORDERS_ALL_ANON	((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1)))
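
To make the point concrete, here is a minimal userspace sketch (not kernel
code) that evaluates the same mask. BIT() is re-defined locally, and the
values PMD_ORDER = 9 (4 KiB base pages on x86-64) and
KHUGEPAGED_MIN_MTHP_ORDER = 2 are assumptions for illustration only;
lowest_order() is a hypothetical helper, not something from the series.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel definitions; values are assumptions. */
#define BIT(n)				(1UL << (n))
#define PMD_ORDER			9	/* assumed: x86-64, 4 KiB base pages */
#define KHUGEPAGED_MIN_MTHP_ORDER	2	/* assumed value, for illustration */

/* Same expression as the kernel macro quoted above. */
#define THP_ORDERS_ALL_ANON	((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1)))

/* Compile-time check: order-0 and order-1 are never part of the mask. */
static_assert((THP_ORDERS_ALL_ANON & (BIT(0) | BIT(1))) == 0,
	      "order-0 and order-1 must be filtered out");

/* Hypothetical helper: lowest order set in the mask, or -1 if it is empty. */
static inline int lowest_order(unsigned long orders)
{
	int order = 0;

	if (!orders)
		return -1;
	while (!(orders & BIT(order)))
		order++;
	return order;
}
```

Under those assumptions the mask covers orders 2 through 9, so
lowest_order(THP_ORDERS_ALL_ANON) equals KHUGEPAGED_MIN_MTHP_ORDER and no
sub-minimum order can be reported as allowed.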
