Message-ID: <805fb8a3-6f95-4f20-b5da-87dc3b1e3b60@suse.cz>
Date: Mon, 4 Aug 2025 16:41:42 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Li Qiang <liqiang01@...inos.cn>
Cc: akpm@...ux-foundation.org, david@...hat.com, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, Liam.Howlett@...cle.com, rppt@...nel.org,
 surenb@...gle.com, mhocko@...e.com
Subject: Re: [PATCH] mm: memory: Force-inline PTE/PMD zapping functions for
 performance

On 8/4/25 15:59, Lorenzo Stoakes wrote:
> OK,
> 
> So I hacked -fopt-info-inline-all into the mm/ Makefile in a rather quick and
> dirty way and it seems some stuff gets inlined locally, but we're mostly hitting
> the '--param max-inline-insns-single limit reached' limit here.
> 
> Which maybe is just a point where the compiler possibly arbitrarily gives up?
> 
> Vlasta rightly pointed out off-list that given this appears to only be used in
> one place you'd expect inlining as register spill isn't such a concern (we'll
> spill saving the stack before function invocation anyway).
> 
> So there might actually be some validity here?
> 
> This is gcc 15.1.1 running on an x86-64 platform by the way.
> 
> mm/memory.c:1871:10: optimized:  Inlined zap_p4d_range/6380 into unmap_page_range/6381 which now has time 1458.996712 and size 65, net change of -11.
> mm/memory.c:1850:10: optimized:  Inlined zap_pud_range.isra/8017 into zap_p4d_range/6380 which now has time 10725.428482 and size 29, net change of -12.
> mm/memory.c:1829:10: missed:   not inlinable: zap_pud_range.isra/8017 -> zap_pmd_range.isra/8018, --param max-inline-insns-single limit reached
> mm/memory.c:1800:10: missed:   not inlinable: zap_pmd_range.isra/8018 -> zap_pte_range/6377, --param max-inline-insns-auto limit reached
> mm/memory.c:1708:8: optimized:  Inlined do_zap_pte_range.constprop/7983 into zap_pte_range/6377 which now has time 4244.320854 and size 148, net change of -15.
> mm/memory.c:1664:9: missed:   not inlinable: do_zap_pte_range.constprop/7983 -> zap_present_ptes.constprop/7985, --param max-inline-insns-single limit reached

I got some weird bloat-o-meter output with this patch:

add/remove: 1/0 grow/shrink: 2/2 up/down: 693/-31 (662)
Function                                     old     new   delta
__handle_mm_fault                           3817    4403    +586
do_swap_page                                4497    4560     +63
mksaveddirty_shift                             -      44     +44
unmap_page_range                            4843    4828     -15
copy_page_range                             6497    6481     -16

but even without this patch, "objdump -t mm/memory.o" shows no zap
functions, so are they already inlined?
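
To illustrate the symbol-table check, here is a minimal standalone sketch
(not kernel code; every sketch_* name is hypothetical and only stands in for
the zap_*_range() chain). It assumes GCC or Clang function attributes. One
helper is force-inlined, the other is kept out of line, so only the latter
should leave a local symbol behind in the object file.

```c
#include <stddef.h>

/* Hypothetical macros, loosely modelled on the kernel's __always_inline
 * and noinline; defined here so the sketch builds outside the kernel. */
#define sketch_always_inline inline __attribute__((always_inline))
#define sketch_noinline __attribute__((noinline))

/* Force-inlined helper: the compiler emits no out-of-line copy, so no
 * symbol for it is expected in "objdump -t" output. */
static sketch_always_inline unsigned long
sketch_zap_leaf(const unsigned long *tbl, size_t n)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += tbl[i];
	return sum;
}

/* Same body, but explicitly kept out of line: this one should remain
 * visible as a local symbol in the object file. */
static sketch_noinline unsigned long
sketch_zap_leaf_outofline(const unsigned long *tbl, size_t n)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += tbl[i];
	return sum;
}

/* Single caller of both helpers, standing in for unmap_page_range(). */
unsigned long sketch_unmap_range(const unsigned long *tbl, size_t n)
{
	return sketch_zap_leaf(tbl, n) + sketch_zap_leaf_outofline(tbl, n);
}
```

Compiling this with "gcc -O2 -c" and running "objdump -t" on the result
should list sketch_zap_leaf_outofline (possibly with a suffix such as .isra
or .constprop, like in the log above) but not sketch_zap_leaf. A missing
symbol only says no out-of-line copy was emitted; it does not say which
caller the body ended up in.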

My gcc is also 15.1.1, but maybe openSUSE has some non-default tunings.
