Message-ID: <6143b8ea-24c0-4446-a0cd-821837f6e74d@gmail.com>
Date: Thu, 31 Jul 2025 20:20:18 +0100
From: Usama Arif <usamaarif642@...il.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, david@...hat.com,
 linux-mm@...ck.org, linux-fsdevel@...r.kernel.org, corbet@....net,
 rppt@...nel.org, surenb@...gle.com, mhocko@...e.com, hannes@...xchg.org,
 baohua@...nel.org, shakeel.butt@...ux.dev, riel@...riel.com, ziy@...dia.com,
 laoar.shao@...il.com, dev.jain@....com, baolin.wang@...ux.alibaba.com,
 npache@...hat.com, Liam.Howlett@...cle.com, ryan.roberts@....com,
 vbabka@...e.cz, jannh@...gle.com, Arnd Bergmann <arnd@...db.de>,
 sj@...nel.org, linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
 kernel-team@...a.com
Subject: Re: [PATCH v2 2/5] mm/huge_memory: convert "tva_flags" to "enum
 tva_type" for thp_vma_allowable_order*()



On 31/07/2025 15:00, Lorenzo Stoakes wrote:
> On Thu, Jul 31, 2025 at 01:27:19PM +0100, Usama Arif wrote:
>> From: David Hildenbrand <david@...hat.com>
>>
>> Describing the context through a type is much clearer, and good enough
>> for our case.
> 
> This is pretty bare bones. What context, what type? Under what
> circumstances?
> 
> This also is missing detail on the key difference here - that actually it
> turns out we _don't_ need these to be flags, rather we can have _distinct_
> modes which are clearer.
> 
> I'd say something like:
> 
> 	when determining which THP orders are eligiible for a VMA mapping,
> 	we have previously specified tva_flags, however it turns out it is
> 	really not necessary to treat these as flags.
> 
> 	Rather, we distinguish between distinct modes.
> 
> 	The only case where we previously combined flags was with
> 	TVA_ENFORCE_SYSFS, but we can avoid this by observing that this is
> 	the default, except for MADV_COLLAPSE or an edge cases in
> 	collapse_pte_mapped_thp() and hugepage_vma_revalidate(), and adding
> 	a mode specifically for this case - TVA_FORCED_COLLAPSE.
> 
> 	... stuff about the different modes...
> 
>>
>> We have:
>> * smaps handling for showing "THPeligible"
>> * Pagefault handling
>> * khugepaged handling
>> * Forced collapse handling: primarily MADV_COLLAPSE, but one other odd case
> 
> Can we actually state what this case is? I mean I guess a handwave in the
> form of 'an edge case in collapse_pte_mapped_thp()' will do also.
> 
> Hmm actually we do weird stuff with this so maybe just handwave.
> 
> Like uprobes calls collapse_pte_mapped_thp()... :/ I'm not sure this 'If we
> are here, we've succeeded in replacing all the native pages in the page
> cache with a single hugepage.' comment is even correct.
> 
> Anyway yeah, hand wave I guess...
> 
>>
>> Really, we want to ignore sysfs only when we are forcing a collapse
>> through MADV_COLLAPSE, otherwise we want to enforce.
> 
> I'd say 'ignoring this edge case, ...'
> 
> I think the clearest thing might be to literally list the before/after
> like:
> 
> * TVA_SMAPS | TVA_ENFORCE_SYSFS -> TVA_SMAPS
> * TVA_IN_PF | TVA_ENFORCE_SYSFS -> TVA_PAGEFAULT
> * TVA_ENFORCE_SYSFS             -> TVA_KHUGEPAGED
> * 0                             -> TVA_FORCED_COLLAPSE
> 
>>
>> With this change, we immediately know if we are in the forced collapse
>> case, which will be valuable next.
>>
>> Signed-off-by: David Hildenbrand <david@...hat.com>
>> Acked-by: Usama Arif <usamaarif642@...il.com>
>> Signed-off-by: Usama Arif <usamaarif642@...il.com>
> 
> Overall this is a great cleanup, some various nits however.
> 

Thanks for the feedback Lorenzo!

I have modified the commit message to be:

    mm/huge_memory: convert "tva_flags" to "enum tva_type"
    
    When determining which THP orders are eligible for a VMA mapping,
    we have previously specified tva_flags, however it turns out it is
    really not necessary to treat these as flags.
    
    Rather, we distinguish between distinct modes.
    
    The only case where we previously combined flags was with
    TVA_ENFORCE_SYSFS, but we can avoid this by observing that this
    is the default, except for MADV_COLLAPSE or edge cases in
    collapse_pte_mapped_thp() and hugepage_vma_revalidate(), and
    adding a mode specifically for this case - TVA_FORCED_COLLAPSE.
    
    We have:
    * smaps handling for showing "THPeligible"
    * Pagefault handling
    * khugepaged handling
    * Forced collapse handling: primarily MADV_COLLAPSE, but also for
      an edge case in collapse_pte_mapped_thp()
    
    Ignoring the collapse_pte_mapped_thp() edge case, we want to
    ignore sysfs only when we are forcing a collapse through
    MADV_COLLAPSE; otherwise we want to enforce it. Hence this patch
    does the following flag-to-enum conversions:
    
    * TVA_SMAPS | TVA_ENFORCE_SYSFS -> TVA_SMAPS
    * TVA_IN_PF | TVA_ENFORCE_SYSFS -> TVA_PAGEFAULT
    * TVA_ENFORCE_SYSFS             -> TVA_KHUGEPAGED
    * 0                             -> TVA_FORCED_COLLAPSE
    
    With this change, we immediately know if we are in the forced collapse
    case, which will be valuable next.
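
For illustration only, the conversion table above can be written out as a small
standalone C sketch. This is not part of the patch: the OLD_ prefix on the flag
macros and the helpers tva_flags_to_type() and tva_enforces_sysfs() are
hypothetical names used only here, so that the old flags and the new enum can
sit in one file. The enum values and the "!= TVA_FORCED_COLLAPSE" check mirror
the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Old scheme: independent bits ORed together (renamed with OLD_ for this sketch). */
#define OLD_TVA_SMAPS		(1UL << 0)	/* Will be used for procfs */
#define OLD_TVA_IN_PF		(1UL << 1)	/* Page fault handler */
#define OLD_TVA_ENFORCE_SYSFS	(1UL << 2)	/* Obey sysfs configuration */

/* New scheme: exactly one mode per calling context, as in the patch. */
enum tva_type {
	TVA_SMAPS,		/* Exposing "THPeligible:" in smaps. */
	TVA_PAGEFAULT,		/* Serving a page fault. */
	TVA_KHUGEPAGED,		/* Khugepaged collapse. */
	TVA_FORCED_COLLAPSE,	/* Forced collapse (e.g. MADV_COLLAPSE). */
};

/*
 * Hypothetical helper: translate an old flag combination into the new mode.
 * Returns false for any combination not listed in the commit message.
 */
static bool tva_flags_to_type(unsigned long flags, enum tva_type *type)
{
	switch (flags) {
	case OLD_TVA_SMAPS | OLD_TVA_ENFORCE_SYSFS:
		*type = TVA_SMAPS;
		return true;
	case OLD_TVA_IN_PF | OLD_TVA_ENFORCE_SYSFS:
		*type = TVA_PAGEFAULT;
		return true;
	case OLD_TVA_ENFORCE_SYSFS:
		*type = TVA_KHUGEPAGED;
		return true;
	case 0:
		*type = TVA_FORCED_COLLAPSE;
		return true;
	default:
		return false;
	}
}

/* Hypothetical helper: only forced collapse ignores the sysfs configuration. */
static bool tva_enforces_sysfs(enum tva_type type)
{
	return type != TVA_FORCED_COLLAPSE;
}

/* Usage example: the old "enforce sysfs" bit is recoverable from the mode. */
static void tva_mapping_selfcheck(void)
{
	enum tva_type type;

	assert(tva_flags_to_type(OLD_TVA_ENFORCE_SYSFS, &type));
	assert(type == TVA_KHUGEPAGED && tva_enforces_sysfs(type));
	assert(tva_flags_to_type(0, &type));
	assert(type == TVA_FORCED_COLLAPSE && !tva_enforces_sysfs(type));
}
```

The point of the sketch is that every old combination in the table carried
TVA_ENFORCE_SYSFS except the "0" case, so the bit is implied by the mode and
does not need to be passed separately.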

>> ---
>>  fs/proc/task_mmu.c      |  4 ++--
>>  include/linux/huge_mm.h | 30 ++++++++++++++++++------------
>>  mm/huge_memory.c        |  8 ++++----
>>  mm/khugepaged.c         | 18 +++++++++---------
>>  mm/memory.c             | 14 ++++++--------
>>  5 files changed, 39 insertions(+), 35 deletions(-)
>>
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 3d6d8a9f13fc..d440df7b3d59 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -1293,8 +1293,8 @@ static int show_smap(struct seq_file *m, void *v)
>>  	__show_smap(m, &mss, false);
>>
>>  	seq_printf(m, "THPeligible:    %8u\n",
>> -		   !!thp_vma_allowable_orders(vma, vma->vm_flags,
>> -			   TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL));
>> +		   !!thp_vma_allowable_orders(vma, vma->vm_flags, TVA_SMAPS,
>> +					      THP_ORDERS_ALL));
> 
> This !! is so gross, wonder if we could have a bool wrapper. But not a big
> deal.
> 
> I also sort of _hate_ the smaps flag anyway, invoking this 'allowable
> orders' thing just for smaps reporting with maybe some minor delta is just
> odd.
> 
> Something like `bool vma_has_thp_allowed_orders(struct vm_area_struct
> *vma);` would be nicer.
> 
> Anyway thoughts for another time... :)
> 
>>
>>  	if (arch_pkeys_enabled())
>>  		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 71db243a002e..b0ff54eee81c 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -94,12 +94,15 @@ extern struct kobj_attribute thpsize_shmem_enabled_attr;
>>  #define THP_ORDERS_ALL	\
>>  	(THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_SPECIAL | THP_ORDERS_ALL_FILE_DEFAULT)
>>
>> -#define TVA_SMAPS		(1 << 0)	/* Will be used for procfs */
> 
> Dumb question, but what does 'TVA' stand for? :P
> 
>> -#define TVA_IN_PF		(1 << 1)	/* Page fault handler */
>> -#define TVA_ENFORCE_SYSFS	(1 << 2)	/* Obey sysfs configuration */
>> +enum tva_type {
>> +	TVA_SMAPS,		/* Exposing "THPeligible:" in smaps. */
> 
> How I hate this flag (just an observation...)
> 
>> +	TVA_PAGEFAULT,		/* Serving a page fault. */
>> +	TVA_KHUGEPAGED,		/* Khugepaged collapse. */
> 
> This is equivalent to the TVA_ENFORCE_SYSFS case before, sort of a default
> I guess, but actually quite nice to add the context that it's sourced from
> khugepaged - I assume this will always be the case when specified?
> 
>> +	TVA_FORCED_COLLAPSE,	/* Forced collapse (i.e., MADV_COLLAPSE). */
> 
> Would put 'e.g.' here, then that allows 'space' for the edge case...
> 
>> +};
>>
>> -#define thp_vma_allowable_order(vma, vm_flags, tva_flags, order) \
>> -	(!!thp_vma_allowable_orders(vma, vm_flags, tva_flags, BIT(order)))
>> +#define thp_vma_allowable_order(vma, vm_flags, type, order) \
>> +	(!!thp_vma_allowable_orders(vma, vm_flags, type, BIT(order)))
> 
> Nit, but maybe worth keeping tva_ prefix - tva_type - here just so it's
> clear what type it refers to.
> 
> But not end of the world.
> 
> Same comment goes for param names below etc.
> 
>>
>>  #define split_folio(f) split_folio_to_list(f, NULL)
>>
>> @@ -264,14 +267,14 @@ static inline unsigned long thp_vma_suitable_orders(struct vm_area_struct *vma,
>>
>>  unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  					 vm_flags_t vm_flags,
>> -					 unsigned long tva_flags,
>> +					 enum tva_type type,
>>  					 unsigned long orders);
>>
>>  /**
>>   * thp_vma_allowable_orders - determine hugepage orders that are allowed for vma
>>   * @vma:  the vm area to check
>>   * @vm_flags: use these vm_flags instead of vma->vm_flags
>> - * @tva_flags: Which TVA flags to honour
>> + * @type: TVA type
>>   * @orders: bitfield of all orders to consider
>>   *
>>   * Calculates the intersection of the requested hugepage orders and the allowed
>> @@ -285,11 +288,14 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  static inline
>>  unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  				       vm_flags_t vm_flags,
>> -				       unsigned long tva_flags,
>> +				       enum tva_type type,
>>  				       unsigned long orders)
>>  {
>> -	/* Optimization to check if required orders are enabled early. */
>> -	if ((tva_flags & TVA_ENFORCE_SYSFS) && vma_is_anonymous(vma)) {
>> +	/*
>> +	 * Optimization to check if required orders are enabled early. Only
>> +	 * forced collapse ignores sysfs configs.
>> +	 */
>> +	if (type != TVA_FORCED_COLLAPSE && vma_is_anonymous(vma)) {
>>  		unsigned long mask = READ_ONCE(huge_anon_orders_always);
>>
>>  		if (vm_flags & VM_HUGEPAGE)
>> @@ -303,7 +309,7 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  			return 0;
>>  	}
>>
>> -	return __thp_vma_allowable_orders(vma, vm_flags, tva_flags, orders);
>> +	return __thp_vma_allowable_orders(vma, vm_flags, type, orders);
>>  }
>>
>>  struct thpsize {
>> @@ -536,7 +542,7 @@ static inline unsigned long thp_vma_suitable_orders(struct vm_area_struct *vma,
>>
>>  static inline unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  					vm_flags_t vm_flags,
>> -					unsigned long tva_flags,
>> +					enum tva_type type,
>>  					unsigned long orders)
>>  {
>>  	return 0;
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 2b4ea5a2ce7d..85252b468f80 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -99,12 +99,12 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
>>
>>  unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>  					 vm_flags_t vm_flags,
>> -					 unsigned long tva_flags,
>> +					 enum tva_type type,
>>  					 unsigned long orders)
>>  {
>> -	bool smaps = tva_flags & TVA_SMAPS;
>> -	bool in_pf = tva_flags & TVA_IN_PF;
>> -	bool enforce_sysfs = tva_flags & TVA_ENFORCE_SYSFS;
>> +	const bool smaps = type == TVA_SMAPS;
>> +	const bool in_pf = type == TVA_PAGEFAULT;
>> +	const bool enforce_sysfs = type != TVA_FORCED_COLLAPSE;
> 
> Some cheeky const-ifying, I like it :)
> 
>>  	unsigned long supported_orders;
>>
>>  	/* Check the intersection of requested and supported orders. */
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 2c9008246785..7a54b6f2a346 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -474,8 +474,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
>>  {
>>  	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
>>  	    hugepage_pmd_enabled()) {
>> -		if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS,
>> -					    PMD_ORDER))
>> +		if (thp_vma_allowable_order(vma, vm_flags, TVA_KHUGEPAGED, PMD_ORDER))
>>  			__khugepaged_enter(vma->vm_mm);
>>  	}
>>  }
>> @@ -921,7 +920,8 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>>  				   struct collapse_control *cc)
>>  {
>>  	struct vm_area_struct *vma;
>> -	unsigned long tva_flags = cc->is_khugepaged ? TVA_ENFORCE_SYSFS : 0;
>> +	enum tva_type tva_type = cc->is_khugepaged ? TVA_KHUGEPAGED :
>> +				 TVA_FORCED_COLLAPSE;
> 
> This is great, this is so much clearer.
> 
> A nit though, I mean I come back to my 'type' vs 'tva_type' nit above, this
> is inconsistent, so we should choose one approach and stick with it.
> 

I don't exactly like the name "tva" (this has nothing to do with the fact that
it took me longer than I would like to figure out that it meant "THP VMA
allowable" :)), so what I will do is use "type" everywhere, if that is OK?
I have no strong opinion, though, and can change the variable/macro args to
tva_type if that is preferred.

The diff over v2 after taking the review comments into account looks quite trivial:

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index b0ff54eee81c..bd4f9e6327e0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -98,7 +98,7 @@ enum tva_type {
        TVA_SMAPS,              /* Exposing "THPeligible:" in smaps. */
        TVA_PAGEFAULT,          /* Serving a page fault. */
        TVA_KHUGEPAGED,         /* Khugepaged collapse. */
-       TVA_FORCED_COLLAPSE,    /* Forced collapse (i.e., MADV_COLLAPSE). */
+       TVA_FORCED_COLLAPSE,    /* Forced collapse (e.g. MADV_COLLAPSE). */
 };
 
 #define thp_vma_allowable_order(vma, vm_flags, type, order) \
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7a54b6f2a346..88cb6339e910 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -920,7 +920,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
                                   struct collapse_control *cc)
 {
        struct vm_area_struct *vma;
-       enum tva_type tva_type = cc->is_khugepaged ? TVA_KHUGEPAGED :
+       enum tva_type type = cc->is_khugepaged ? TVA_KHUGEPAGED :
                                 TVA_FORCED_COLLAPSE;
 
        if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
@@ -932,7 +932,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 
        if (!thp_vma_suitable_order(vma, address, PMD_ORDER))
                return SCAN_ADDRESS_RANGE;
-       if (!thp_vma_allowable_order(vma, vma->vm_flags, tva_type, PMD_ORDER))
+       if (!thp_vma_allowable_order(vma, vma->vm_flags, type, PMD_ORDER))
                return SCAN_VMA_CHECK;
        /*
         * Anon VMA expected, the address may be unmapped then
@@ -1532,8 +1532,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
         * in the page cache with a single hugepage. If a mm were to fault-in
         * this memory (mapped by a suitably aligned VMA), we'd get the hugepage
         * and map it by a PMD, regardless of sysfs THP settings. As such, let's
-        * analogously elide sysfs THP settings here and pretend we are
-        * collapsing.
+        * analogously elide sysfs THP settings here and force collapse.
         */
        if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_FORCED_COLLAPSE, PMD_ORDER))
                return SCAN_VMA_CHECK;
