Message-ID: <5d8477d9-fdb8-4a85-8978-1c0fc4074158@lucifer.local>
Date: Wed, 18 Jun 2025 11:35:50 +0100
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Yunshui Jiang <jiangyunshui@...inos.cn>
Cc: linux-kernel@...r.kernel.org, linux-mm@...k.org, akpm@...ux-foundation.org,
        david@...hat.com, Liam.Howlett@...cle.com, vbabka@...e.cz
Subject: Re: [PATCH] mm: Inline vma_needs_copy

On Wed, Jun 18, 2025 at 09:42:09AM +0800, Yunshui Jiang wrote:
> From: jiangyunshui <jiangyunshui@...inos.cn>
>
> Since commit bcd51a3c679d ("hugetlb: lazy page table copies
> in fork()"), the logic that decides whether to copy page
> tables in copy_page_range() has been extracted into a
> separate function, vma_needs_copy(). While this change
> improves code readability, it also incurs extra function call
> overhead, especially where fork() is called frequently.
>
> Inline func vma_needs_copy to optimize the copy_page_range
> performance. Given that func vma_needs_copy is only called
> by copy_page_range, inlining it would not cause unacceptable
> code bloat.
>
> Testing was done with the byte-unixbench spawn benchmark
> (which frequently calls fork). I measured 1.7% improvement
> on x86 and 1.8% improvement on arm64.

As others have said, you will need to provide details of your compiler
setup, because modern compilers already inline this.

If it's ye olde compiler, I'm not sure this is justified...
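
For anybody wanting to check the inlining question on their own toolchain,
here is a minimal userspace sketch of the same shape: a static predicate with
exactly one caller. Everything prefixed demo_ or DEMO_ is hypothetical and
only illustrates the pattern; it is not the kernel's vma_needs_copy() or
vm_area_struct, and the macro is just a stand-in for the kernel's
__always_inline built on the GCC/Clang always_inline attribute.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's __always_inline (GCC/Clang attribute). */
#define demo_always_inline inline __attribute__((__always_inline__))

/* Hypothetical flag and region descriptor, NOT the kernel's definitions. */
#define DEMO_FLAG_SPECIAL 0x1UL

struct demo_region {
	unsigned long flags;
	bool has_private_pages;
};

/*
 * Single-caller predicate, analogous in shape to vma_needs_copy().
 * Swap demo_always_inline for a plain "static bool" to see what the
 * compiler decides on its own.
 */
static demo_always_inline bool
demo_needs_copy(const struct demo_region *r)
{
	if (r->flags & DEMO_FLAG_SPECIAL)
		return true;
	return r->has_private_pages;
}

/* The only caller: counts regions that would need an eager copy. */
size_t demo_count_copies(const struct demo_region *regions, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (demo_needs_copy(&regions[i]))
			count++;
	return count;
}
```

Building both variants with something like "gcc -O2 -S" and looking for a call
to demo_needs_copy in the generated assembly shows whether the attribute
changes anything. With optimisation enabled, current GCC and Clang will
usually inline a static function that has a single caller without being told
to, which is why the compiler version and flags matter for the numbers quoted
above.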

>
> Signed-off-by: jiangyunshui <jiangyunshui@...inos.cn>
> ---
>  mm/memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 8eba595056fe..d15b07f96ab1 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1337,7 +1337,7 @@ copy_p4d_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
>   * false when we can speed up fork() by allowing lazy page faults later until
>   * when the child accesses the memory range.
>   */
> -static bool
> +static __always_inline bool
>  vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
>  {

This probably needs to live in vma.h... todo++...

>  	/*
> --
> 2.47.1
>
