linux-kernel - Re: [PATCH v3 1/7] mm: khugepaged: extract vm flag setting outside of hugepage

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <18be636f-d6a7-4270-b324-22750b3a358c@lucifer.local>
Date: Tue, 20 May 2025 15:43:17 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Usama Arif <usamaarif642@...il.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, david@...hat.com,
        linux-mm@...ck.org, hannes@...xchg.org, shakeel.butt@...ux.dev,
        riel@...riel.com, ziy@...dia.com, laoar.shao@...il.com,
        baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com,
        npache@...hat.com, ryan.roberts@....com, vbabka@...e.cz,
        jannh@...gle.com, Arnd Bergmann <arnd@...db.de>,
        linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
        kernel-team@...a.com
Subject: Re: [PATCH v3 1/7] mm: khugepaged: extract vm flag setting outside
 of hugepage_madvise

This commit message is really poor. You're also not mentioning that you're
changing s390 behaviour?

On Mon, May 19, 2025 at 11:29:53PM +0100, Usama Arif wrote:
> This is so that flag setting can be resused later in other functions,

Typo.

> to reduce code duplication (including the s390 exception).
>
> No functional change intended with this patch.

I'm pretty sure somebody reviewed that this should just be merged with whatever
uses this? I'm not sure this is all that valuable as you're not really changing
this structurally very much.

>
> Signed-off-by: Usama Arif <usamaarif642@...il.com>

Yeah I'm not a fan of this patch, it's buggy and really unclear what the
purpose is here.

> ---
>  include/linux/huge_mm.h |  1 +
>  mm/khugepaged.c         | 26 +++++++++++++++++---------
>  2 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 2f190c90192d..23580a43787c 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -431,6 +431,7 @@ change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			__split_huge_pud(__vma, __pud, __address);	\
>  	}  while (0)
>
> +int hugepage_set_vmflags(unsigned long *vm_flags, int advice);
>  int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags,
>  		     int advice);
>  int madvise_collapse(struct vm_area_struct *vma,
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index b04b6a770afe..ab3427c87422 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -346,8 +346,7 @@ struct attribute_group khugepaged_attr_group = {
>  };
>  #endif /* CONFIG_SYSFS */
>
> -int hugepage_madvise(struct vm_area_struct *vma,
> -		     unsigned long *vm_flags, int advice)
> +int hugepage_set_vmflags(unsigned long *vm_flags, int advice)


>  {
>  	switch (advice) {
>  	case MADV_HUGEPAGE:
> @@ -358,16 +357,10 @@ int hugepage_madvise(struct vm_area_struct *vma,
>  		 * ignore the madvise to prevent qemu from causing a SIGSEGV.
>  		 */
>  		if (mm_has_pgste(vma->vm_mm))

This is broken, you refer to vma which doesn't exist.

As the kernel bots are telling you...

> -			return 0;
> +			return -EPERM;

Why are you now returning an error?

This seems like a super broken way of making the caller return 0. Just make this
whole thing a bool return if you're going to treat it like a boolean function.

>  #endif
>  		*vm_flags &= ~VM_NOHUGEPAGE;
>  		*vm_flags |= VM_HUGEPAGE;
> -		/*
> -		 * If the vma become good for khugepaged to scan,
> -		 * register it here without waiting a page fault that
> -		 * may not happen any time soon.
> -		 */
> -		khugepaged_enter_vma(vma, *vm_flags);
>  		break;
>  	case MADV_NOHUGEPAGE:
>  		*vm_flags &= ~VM_HUGEPAGE;
> @@ -383,6 +376,21 @@ int hugepage_madvise(struct vm_area_struct *vma,
>  	return 0;
>  }
>
> +int hugepage_madvise(struct vm_area_struct *vma,
> +		     unsigned long *vm_flags, int advice)
> +{
> +	if (advice == MADV_HUGEPAGE && !hugepage_set_vmflags(vm_flags, advice)) {

So now you've completely broken MADV_NOHUGEPAGE haven't you?

> +		/*
> +		 * If the vma become good for khugepaged to scan,
> +		 * register it here without waiting a page fault that
> +		 * may not happen any time soon.
> +		 */
> +		khugepaged_enter_vma(vma, *vm_flags);
> +	}
> +
> +	return 0;
> +}
> +
>  int __init khugepaged_init(void)
>  {
>  	mm_slot_cache = KMEM_CACHE(khugepaged_mm_slot, 0);
> --
> 2.47.1
>