Message-ID: <3aa8d400-fa6a-48bd-b9f2-3bd6f37e523d@suse.cz>
Date: Thu, 22 Jan 2026 09:16:04 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Harry Yoo <harry.yoo@...cle.com>
Cc: Petr Tesarik <ptesarik@...e.com>, Christoph Lameter <cl@...two.org>,
David Rientjes <rientjes@...gle.com>,
Roman Gushchin <roman.gushchin@...ux.dev>, Hao Li <hao.li@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Uladzislau Rezki <urezki@...il.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Alexei Starovoitov <ast@...nel.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
bpf@...r.kernel.org, kasan-dev@...glegroups.com
Subject: Re: [PATCH v3 14/21] slab: simplify kmalloc_nolock()
On 1/22/26 02:53, Harry Yoo wrote:
> On Fri, Jan 16, 2026 at 03:40:34PM +0100, Vlastimil Babka wrote:
>> The kmalloc_nolock() implementation has several complications and
>> restrictions due to SLUB's cpu slab locking, lockless fastpath and
>> PREEMPT_RT differences. With cpu slab usage removed, we can simplify
>> things:
>>
>> - relax the PREEMPT_RT context checks as they were before commit
>> a4ae75d1b6a2 ("slab: fix kmalloc_nolock() context check for
>> PREEMPT_RT") and also reference the explanation comment in the page
>> allocator
>>
>> - the local_lock_cpu_slab() macros became unused, remove them
>>
>> - we no longer need to set up lockdep classes on PREEMPT_RT
>>
>> - we no longer need to annotate ___slab_alloc as NOKPROBE_SYMBOL
>> since there's no lockless cpu freelist manipulation anymore
>>
>> - __slab_alloc_node() can be called from kmalloc_nolock_noprof()
>> unconditionally. It can also no longer return EBUSY. But trylock
>> failures can still happen so retry with the larger bucket if the
>> allocation fails for any reason.
>>
>> Note that we still need __CMPXCHG_DOUBLE: while we no longer use
>> cmpxchg16b on the cpu freelist, we still use it on the slab
>> freelist, and the alternative is slab_lock(), which can be
>> interrupted by an NMI. Clarify the comment to mention this
>> specifically.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
>> ---
>
> What a nice cleanup!
>
> Looks good to me,
> Reviewed-by: Harry Yoo <harry.yoo@...cle.com>
Thanks!
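
BTW for anyone following along, the retry described in the last point
of the changelog roughly takes this shape (a simplified sketch only,
not the exact mm/slub.c code; lookup_kmalloc_cache() is a hypothetical
stand-in for the kmalloc bucket lookup):

	void *kmalloc_nolock_sketch(size_t size, gfp_t gfp_flags, int node)
	{
		void *ret = NULL;

		while (!ret && size <= KMALLOC_MAX_CACHE_SIZE) {
			/* hypothetical helper: size -> kmalloc cache */
			struct kmem_cache *s = lookup_kmalloc_cache(size);

			/*
			 * Any trylock failure inside simply fails the
			 * allocation; the next bucket is a different cache
			 * with different locks, so retry there.
			 */
			ret = __slab_alloc_node(s, gfp_flags, node,
						_RET_IP_, size);
			if (!ret)
				size <<= 1;
		}
		return ret;
	}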
> with a nit below.
>
>> mm/slab.h | 1 -
>> mm/slub.c | 144 +++++++++++++-------------------------------------------------
>> 2 files changed, 29 insertions(+), 116 deletions(-)
>>
>> diff --git a/mm/slab.h b/mm/slab.h
>> index 4efec41b6445..e9a0738133ed 100644
>> --- a/mm/slab.h
>> +++ b/mm/slab.h
>> [...]
>> diff --git a/mm/slub.c b/mm/slub.c
>> [...]
>> @@ -5268,10 +5196,11 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
>> if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
>> /*
>> * kmalloc_nolock() is not supported on architectures that
>> - * don't implement cmpxchg16b, but debug caches don't use
>> - * per-cpu slab and per-cpu partial slabs. They rely on
>> - * kmem_cache_node->list_lock, so kmalloc_nolock() can
>> - * attempt to allocate from debug caches by
>> + * don't implement cmpxchg16b and thus need slab_lock()
>> + * which could be preempted by an NMI.
>
> nit: I think this limitation can now be removed, because the only
> slab_lock() use in the allocation path is get_partial_node() ->
> __slab_update_freelist(), and it is always taken under n->list_lock.
>
> Being interrupted by an NMI while holding the slab lock is fine,
> because the NMI context should fail to acquire n->list_lock and
> bail out.
Hmm, but somebody might be freeing with __slab_free() without taking
n->list_lock (the slab is on the partial list and is expected to remain
there after the free), then an NMI arrives and the allocation can take
n->list_lock just fine?
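
To spell out the interleaving I have in mind, on a !__CMPXCHG_DOUBLE
arch where freelist updates fall back to slab_lock() (a hypothetical,
abbreviated timeline):

	task context (CPU X)              NMI on the same CPU
	--------------------              -------------------
	__slab_free()
	  slab_lock(slab)
	  <NMI arrives>                   kmalloc_nolock()
	                                    spin_trylock(&n->list_lock)
	                                      -> succeeds, lock was free
	                                    get_partial_node()
	                                      __slab_update_freelist()
	                                        slab_lock(slab)
	                                          -> spins: the owner cannot
	                                             run until the NMI returns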
> But no hurry on this, it's probably not important enough to delay
> this series :)
>
>> + * But debug caches don't use that and only rely on
>> + * kmem_cache_node->list_lock, so kmalloc_nolock() can attempt
>> + * to allocate from debug caches by
>> * spin_trylock_irqsave(&n->list_lock, ...)
>> */
>> return NULL;
>>
>
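FWIW, the debug-cache attempt the comment describes boils down to the
usual trylock-and-bail pattern, roughly (illustrative sketch only;
alloc_from_debug_partial() is a hypothetical placeholder for the real
helper):

	unsigned long flags;
	void *object;

	/* nolock callers (potentially in NMI) must never spin here */
	if (!spin_trylock_irqsave(&n->list_lock, flags))
		return NULL;	/* kmalloc_nolock() retries a larger bucket */
	object = alloc_from_debug_partial(s, n);	/* hypothetical helper */
	spin_unlock_irqrestore(&n->list_lock, flags);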