Message-ID: <CANn89iJW-h0ZUM1oEsPQxjQbe2xnF_+YJZfy6pOHCJu9BkFtwA@mail.gmail.com>
Date: Tue, 20 Jan 2026 09:55:21 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-kernel <linux-kernel@...r.kernel.org>,
Christoph Lameter <cl@...two.org>, David Rientjes <rientjes@...gle.com>,
Roman Gushchin <roman.gushchin@...ux.dev>, Harry Yoo <harry.yoo@...cle.com>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCH] slub: Make sure cache_from_obj() is inlined

On Mon, Jan 19, 2026 at 11:07 PM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 1/15/26 14:06, Eric Dumazet wrote:
> > clang ignores the inline attribute because it thinks cache_from_obj()
> > is too big.
> >
> > Move the slow path into a separate function (__cache_from_obj())
> > and use __fastpath_inline to please clang and CONFIG_SLUB_TINY configs.
> >
> > This makes kmem_cache_free() and build_detached_freelist()
> > slightly faster.
> >
> > $ size mm/slub.clang.before.o mm/slub.clang.after.o
> >    text    data     bss     dec     hex filename
> >   77716    7657    4208   89581   15ded mm/slub.clang.before.o
> >   77766    7673    4208   89647   15e2f mm/slub.clang.after.o
> >
> > $ scripts/bloat-o-meter -t mm/slub.clang.before.o mm/slub.clang.after.o
> > Function                                     old     new   delta
> > __cache_from_obj                               -     211    +211
> > build_detached_freelist                      542     569     +27
> > kmem_cache_free                              896     919     +23
> > cache_from_obj                               229       -    -229
> >
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>
> I assume this is without CONFIG_SLAB_FREELIST_HARDENED. But almost everyone
> uses it today:
> https://oracle.github.io/kconfigs/?config=UTS_RELEASE&config=SLAB_FREELIST_HARDENED
>
> And with that enabled it would likely make things slower due to the extra
> function call to __cache_from_obj(), which does its own virt_to_slab()
> although kmem_cache_free() also does it, etc.
>

Believe it or not, with CONFIG_SLAB_FREELIST_HARDENED=y, clang inlines
cache_from_obj() both before and after my patch :)

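The split described in the quoted patch (a small, always-inlined fast path that falls through to an out-of-line slow path) can be sketched in userspace as follows. This is only an illustration: struct cache, struct object, lookup_cache(), lookup_cache_slow() and debug_checks_enabled are hypothetical stand-ins, not the kernel's definitions, and the attributes are the plain GCC/clang spellings rather than the kernel's __fastpath_inline macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, not the kernel's struct kmem_cache or slab types. */
struct cache { const char *name; };
struct object { struct cache *owner; };

static int debug_checks_enabled; /* stand-in for hardening/debug configuration */
static int slow_path_calls;      /* counts slow-path entries, for demonstration only */

/* Slow path: kept out of line so its body is not copied into every caller. */
static __attribute__((noinline)) struct cache *
lookup_cache_slow(struct cache *s, struct object *obj)
{
	assert(s != NULL && obj != NULL);
	slow_path_calls++;
	if (obj->owner != NULL && obj->owner != s)
		fprintf(stderr, "wrong cache: expected %s, object is from %s\n",
			s->name, obj->owner->name);
	return obj->owner;
}

/* Fast path: a single test, small enough that forcing inlining is cheap. */
static inline __attribute__((always_inline)) struct cache *
lookup_cache(struct cache *s, struct object *obj)
{
	if (!debug_checks_enabled)
		return s;
	return lookup_cache_slow(s, obj);
}
```

With debug_checks_enabled left at 0, lookup_cache() returns its first argument without ever calling the slow path; once it is set, every call pays for the extra function call.
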
> However I'd hope things could be improved differently and for all configs.
> cache_from_obj() is mostly a relic from when memcgs had separate kmem_cache
> instances. It should have been just removed... but hardening repurposed it.
>
> We can however kick it from build_detached_freelist() completely as we're
> not checking every object anyway. And kmem_cache_free() can be rewritten to do
> the checks open-coded and call a warn function if they fail. If anyone
> cares to harden build_detached_freelist() properly, it could be done
> similarly to this.
>
> How does that look for you wrt performance and bloat-o-meter?

This looks fine to me, thanks!

$ scripts/bloat-o-meter -t mm/slub.o.old mm/slub.o | grep -v Ltmp
add/remove: 78/78 grow/shrink: 8/1 up/down: 6862/-6443 (419)
Function                                     old     new   delta
warn_free_bad_obj                              -     242    +242
kmem_cache_free                              896     929     +33
build_detached_freelist                      542     531     -11
cache_from_obj                               229       -    -229
Total: Before=487832, After=488251, chg +0.09%
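
The alternative discussed in this thread (open-coded checks in the free path, with only the failure report moved to a separate warn function) could look roughly like the userspace sketch below. It is not the actual patch: struct cache, struct slab, cache_free(), warn_bad_free() and the checks_enabled flag are all hypothetical names chosen for illustration. The point it tries to show is that the slab looked up once by the caller is reused for the check, and the cold reporting code stays out of the hot function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins, not the kernel's struct kmem_cache or struct slab. */
struct cache { const char *name; };
struct slab { struct cache *owner; };

static int warnings; /* counts reported bad frees, for demonstration only */

/* Cold, out-of-line reporting: only reached when a check has already failed. */
static __attribute__((noinline, cold)) void
warn_bad_free(const struct cache *s, const struct slab *slab)
{
	warnings++;
	fprintf(stderr, "bad free into cache %s: object is from %s\n",
		s->name,
		(slab != NULL && slab->owner != NULL) ? slab->owner->name
						      : "(no slab)");
}

/*
 * Free path with the checks written inline. The caller passes the slab it
 * already looked up, so no second lookup is needed for the check.
 * Returns true if the object would be freed, false if it was rejected.
 */
static inline bool
cache_free(struct cache *s, struct slab *slab, bool checks_enabled)
{
	if (checks_enabled && (slab == NULL || slab->owner != s)) {
		warn_bad_free(s, slab);
		return false;
	}
	/* The real free of the object would happen here. */
	return true;
}
```

A mismatched cache or a missing slab is rejected and reported only when checks_enabled is true; with checks off, the function goes straight to the free.
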