Message-ID: <9dcd181a-7058-4fee-83a5-695df77c9edb@arm.com>
Date: Tue, 4 Nov 2025 16:05:46 +0530
From: Dev Jain <dev.jain@....com>
To: Vlastimil Babka <vbabka@...e.cz>, Harry Yoo <harry.yoo@...cle.com>,
Alexei Starovoitov <ast@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Christoph Lameter <cl@...two.org>, David Rientjes <rientjes@...gle.com>,
Roman Gushchin <roman.gushchin@...ux.dev>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] slab: prevent infinite loop in kmalloc_nolock() with debugging
On 04/11/25 3:54 pm, Vlastimil Babka wrote:
> On 11/4/25 6:26 AM, Dev Jain wrote:
>> On 03/11/25 5:54 pm, Vlastimil Babka wrote:
>>> In review of a followup work, Harry noticed a potential infinite loop.
>>> Upon closer inspection, it already exists for kmalloc_nolock() on a
>>> cache with debugging enabled, since commit af92793e52c3 ("slab:
>>> Introduce kmalloc_nolock() and kfree_nolock().")
>>>
>>> When alloc_single_from_new_slab() fails to trylock node list_lock, we
>>> keep retrying to get partial slab or allocate a new slab. If we indeed
>>> interrupted somebody holding the list_lock, the trylock will fail
>> Hi Vlastimil,
>>
>> I see that we always take n->list_lock spinlock by disabling irqs. So
>> how can we interrupt someone holding the list_lock?
> From an NMI or e.g. a kprobe->bpf hook, which are the use cases for
> kmalloc_nolock(). The word "interrupt" thus doesn't mean IRQ, but I'm
> not sure which word would be better. "Preempt" would perhaps be even
> more misleading.
>
>> If we are already in a path holding list_lock, and trigger a slab
>> allocation and recursively end up in the same path again, we can get
>> the situation you mention, is that possible?
> There shouldn't be such recursion in the code itself, in the absence of
> NMI/kprobe/etc.
Thanks for explaining.
>>> deterministically and we end up allocating and defer-freeing slabs
>>> indefinitely with no progress.
>>>
>>> To fix it, fail the allocation if spinning is not allowed. This is
>>> acceptable in the restricted context of kmalloc_nolock(), especially
>>> with debugging enabled.
>>>
>>> Reported-by: Harry Yoo <harry.yoo@...cle.com>
>>> Closes: https://lore.kernel.org/all/aQLqZjjq1SPD3Fml@hyeyoo/
>>> Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
>>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
>>> ---
>>> as we discussed in the linked thread, 6.18 hotfix to be included in
>>> slab/for-next-fixes
>>> ---
>>> mm/slub.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/slub.c b/mm/slub.c
>>> index d4367f25b20d..f1a5373eee7b 100644
>>> --- a/mm/slub.c
>>> +++ b/mm/slub.c
>>> @@ -4666,8 +4666,12 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>>>  	if (kmem_cache_debug(s)) {
>>>  		freelist = alloc_single_from_new_slab(s, slab, orig_size, gfpflags);
>>>
>>> -		if (unlikely(!freelist))
>>> +		if (unlikely(!freelist)) {
>>> +			/* This could cause an endless loop. Fail instead. */
>>> +			if (!allow_spin)
>>> +				return NULL;
>>>  			goto new_objects;
>>> +		}
>>>
>>>  		if (s->flags & SLAB_STORE_USER)
>>>  			set_track(s, freelist, TRACK_ALLOC, addr,
>>>
>>> ---
>>> base-commit: 6146a0f1dfae5d37442a9ddcba012add260bceb0
>>> change-id: 20251103-fix-nolock-loop-854e0101672f
>>>
>>> Best regards,