lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <016e4d4c-c16b-45d5-a903-681afc6b4203@linux.dev>
Date: Mon, 15 Sep 2025 11:02:14 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Dev Jain <dev.jain@....com>
Cc: ziy@...dia.com, baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com,
 npache@...hat.com, ryan.roberts@....com, baohua@...nel.org,
 ioworker0@...il.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 akpm@...ux-foundation.org, lorenzo.stoakes@...cle.com, david@...hat.com
Subject: Re: [PATCH mm-new 1/3] mm/khugepaged: skip unsuitable VMAs earlier in
 khugepaged_scan_mm_slot()

Hey Dev,

Thanks for taking time to review!

On 2025/9/15 00:16, Dev Jain wrote:
> 
> On 14/09/25 8:05 pm, Lance Yang wrote:
>> From: Lance Yang <lance.yang@...ux.dev>
>>
>> Let's skip unsuitable VMAs early in the khugepaged scan; specifically,
>> mlocked VMAs should not be touched.
>>
>> Note that the only other user of the VM_NO_KHUGEPAGED mask is
>>   __thp_vma_allowable_orders(), which is also used by the MADV_COLLAPSE
>> path. Since MADV_COLLAPSE has different rules (e.g., for mlocked 
>> VMAs), we
>> cannot simply make the shared mask stricter as that would break it.
>>
>> So, we also introduce a new VM_NO_THP_COLLAPSE mask for that helper,
>> leaving the stricter checks to be applied only within the khugepaged path
>> itself.
>>
>> Signed-off-by: Lance Yang <lance.yang@...ux.dev>
>> ---
>>   include/linux/mm.h |  6 +++++-
>>   mm/huge_memory.c   |  2 +-
>>   mm/khugepaged.c    | 14 +++++++++++++-
>>   3 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index be3e6fb4d0db..cb54d94b2343 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -505,7 +505,11 @@ extern unsigned int kobjsize(const void *objp);
>>   #define VM_REMAP_FLAGS (VM_IO | VM_PFNMAP | VM_DONTEXPAND | 
>> VM_DONTDUMP)
>>   /* This mask prevents VMA from being scanned with khugepaged */
>> -#define VM_NO_KHUGEPAGED (VM_SPECIAL | VM_HUGETLB)
>> +#define VM_NO_KHUGEPAGED \
>> +    (VM_SPECIAL | VM_HUGETLB | VM_LOCKED_MASK | VM_NOHUGEPAGE)
>> +
>> +/* This mask prevents VMA from being collapsed by any THP path */
>> +#define VM_NO_THP_COLLAPSE    (VM_SPECIAL | VM_HUGETLB)
> 
> VM_NO_KHUGEPAGED should then be defined as VM_NO_THP_COLLAPSE | 
> VM_LOCKED_MASK | VM_NOHUGEPAGE.

Yep, it's a good cleanup ;)

> But...
> 
> I believe that the eligibility checking for khugepaged collapse is the 
> business of
> thp_vma_allowable_order(). This functionality should be put there, we 
> literally
> have a TVA_KHUGEPAGED flag :)

Good spot. That's a much better apporach!

My initial thinking was to keep thp_vma_allowable_order() as generic as
possible, avoiding specific checks for individual callers ;)

BUT you are right, the TVA_KHUGEPAGED flag is only passed from the
khugepaged path, so the compiler will optimize out the branch for other
callers, leaving no runtime overhead.

Will rework this patch for v2 as your suggestion!

Thanks,
Lance

> 
>>   /* This mask defines which mm->def_flags a process can inherit its 
>> parent */
>>   #define VM_INIT_DEF_MASK    VM_NOHUGEPAGE
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index d6fc669e11c1..2e91526a037f 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -134,7 +134,7 @@ unsigned long __thp_vma_allowable_orders(struct 
>> vm_area_struct *vma,
>>        * Must be checked after dax since some dax mappings may have
>>        * VM_MIXEDMAP set.
>>        */
>> -    if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED))
>> +    if (!in_pf && !smaps && (vm_flags & VM_NO_THP_COLLAPSE))
>>           return 0;
>>       /*
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 7c5ff1b23e93..e54f99bb0b57 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -345,6 +345,17 @@ struct attribute_group khugepaged_attr_group = {
>>   };
>>   #endif /* CONFIG_SYSFS */
>> +/**
>> + * khugepaged_should_scan_vma - check if a VMA is a candidate for 
>> collapse
>> + * @vm_flags: The flags of the VMA to check.
>> + *
>> + * Returns: true if the VMA should be scanned by khugepaged, false 
>> otherwise.
>> + */
>> +static inline bool khugepaged_should_scan_vma(vm_flags_t vm_flags)
>> +{
>> +    return !(vm_flags & VM_NO_KHUGEPAGED);
>> +}
>> +
>>   int hugepage_madvise(struct vm_area_struct *vma,
>>                vm_flags_t *vm_flags, int advice)
>>   {
>> @@ -2443,7 +2454,8 @@ static unsigned int 
>> khugepaged_scan_mm_slot(unsigned int pages, int *result,
>>               progress++;
>>               break;
>>           }
>> -        if (!thp_vma_allowable_order(vma, vma->vm_flags, 
>> TVA_KHUGEPAGED, PMD_ORDER)) {
>> +        if (!khugepaged_should_scan_vma(vma->vm_flags) ||
>> +            !thp_vma_allowable_order(vma, vma->vm_flags, 
>> TVA_KHUGEPAGED, PMD_ORDER)) {
>>   skip:
>>               progress++;
>>               continue;


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ