Message-ID: <805fb8a3-6f95-4f20-b5da-87dc3b1e3b60@suse.cz>
Date: Mon, 4 Aug 2025 16:41:42 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Li Qiang <liqiang01@...inos.cn>
Cc: akpm@...ux-foundation.org, david@...hat.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Liam.Howlett@...cle.com, rppt@...nel.org,
surenb@...gle.com, mhocko@...e.com
Subject: Re: [PATCH] mm: memory: Force-inline PTE/PMD zapping functions for
performance
On 8/4/25 15:59, Lorenzo Stoakes wrote:
> OK,
>
> So I hacked -fopt-info-inline-all into the mm/ Makefile in a rather quick and
> dirty way and it seems some stuff gets inlined locally, but we're mostly hitting
> the '--param max-inline-insns-single limit reached' limit here.
>
> Which maybe is just a point where the compiler possibly arbitrarily gives up?
>
> Vlasta rightly pointed out off-list that given this appears to only be used in
> one place you'd expect inlining as register spill isn't such a concern (we'll
> spill saving the stack before function invocation anyway).
>
> So there might actually be some validity here?
>
> This is gcc 15.1.1 running on an x86-64 platform by the way.
>
> mm/memory.c:1871:10: optimized: Inlined zap_p4d_range/6380 into unmap_page_range/6381 which now has time 1458.996712 and size 65, net change of -11.
> mm/memory.c:1850:10: optimized: Inlined zap_pud_range.isra/8017 into zap_p4d_range/6380 which now has time 10725.428482 and size 29, net change of -12.
> mm/memory.c:1829:10: missed: not inlinable: zap_pud_range.isra/8017 -> zap_pmd_range.isra/8018, --param max-inline-insns-single limit reached
> mm/memory.c:1800:10: missed: not inlinable: zap_pmd_range.isra/8018 -> zap_pte_range/6377, --param max-inline-insns-auto limit reached
> mm/memory.c:1708:8: optimized: Inlined do_zap_pte_range.constprop/7983 into zap_pte_range/6377 which now has time 4244.320854 and size 148, net change of -15.
> mm/memory.c:1664:9: missed: not inlinable: do_zap_pte_range.constprop/7983 -> zap_present_ptes.constprop/7985, --param max-inline-insns-single limit reached
I got some weird bloat-o-meter output with this patch:
add/remove: 1/0 grow/shrink: 2/2 up/down: 693/-31 (662)
Function                                     old     new   delta
__handle_mm_fault                           3817    4403    +586
do_swap_page                                4497    4560     +63
mksaveddirty_shift                             -      44     +44
unmap_page_range                            4843    4828     -15
copy_page_range                             6497    6481     -16
but even without this patch, "objdump -t mm/memory.o" shows no zap
functions, so they are already inlined?
This is also gcc 15.1.1, but maybe openSUSE has some non-default tunings.
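
For anyone wanting to reproduce the symbol-table check outside the kernel, here is a minimal userspace sketch. All names in it are hypothetical and it is not code from mm/memory.c; it only assumes GCC's always_inline and noinline function attributes. The idea is that a static helper forced inline into its only caller should leave no symbol of its own in "objdump -t", while the noinline variant should remain as a local symbol (GCC may still append a suffix such as .isra or .constprop to it).

```c
/*
 * Sketch only: hypothetical names, not taken from mm/memory.c.
 * Assumes GCC-style function attributes.
 */
#include <stddef.h>

#define sketch_always_inline inline __attribute__((always_inline))
#define sketch_noinline __attribute__((noinline))

/* Forced inline: expected to leave no standalone symbol in the object file. */
static sketch_always_inline unsigned long
zap_leaf_forced(const unsigned long *tbl, size_t n)
{
	unsigned long zapped = 0;

	for (size_t i = 0; i < n; i++)
		if (tbl[i])		/* count the non-empty entries */
			zapped++;
	return zapped;
}

/* Same body, but kept out of line: expected to stay visible as a symbol. */
static sketch_noinline unsigned long
zap_leaf_outlined(const unsigned long *tbl, size_t n)
{
	unsigned long zapped = 0;

	for (size_t i = 0; i < n; i++)
		if (tbl[i])
			zapped++;
	return zapped;
}

/* Each helper has exactly one caller, mirroring the single-use zap chain. */
unsigned long zap_range_forced(const unsigned long *tbl, size_t n)
{
	return zap_leaf_forced(tbl, n);
}

unsigned long zap_range_outlined(const unsigned long *tbl, size_t n)
{
	return zap_leaf_outlined(tbl, n);
}
```

Compiling this to an object file at -O2 and running "objdump -t" on it should list zap_range_forced, zap_range_outlined and some form of zap_leaf_outlined, but nothing for zap_leaf_forced. Both callers return the same value, so the only difference to look for is in the symbol table and the generated code size.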