Message-ID: <01cf95d7-4e38-43c6-80ef-c990f66f1e26@suse.cz>
Date: Sat, 10 Jan 2026 16:41:14 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Chris Mason <clm@...a.com>, Roman Gushchin <roman.gushchin@...ux.dev>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Christoph Lameter <cl@...two.org>, David Rientjes <rientjes@...gle.com>,
Harry Yoo <harry.yoo@...cle.com>, Uladzislau Rezki <urezki@...il.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Alexei Starovoitov <ast@...nel.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
bpf@...r.kernel.org, kasan-dev@...glegroups.com,
Petr Tesarik <ptesarik@...e.com>, "Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH RFC 10/19] slab: remove cpu (partial) slabs usage from
allocation paths
On 1/10/26 14:20, Chris Mason wrote:
> On 1/9/26 3:16 AM, Vlastimil Babka wrote:
>> On 10/24/25 16:29, Chris Mason wrote:
>>> On Thu, 23 Oct 2025 15:52:32 +0200 Vlastimil Babka <vbabka@...e.cz> wrote:
>
> [ ... ]
>
>> By the way, there was another bug in this patch, causing a severe memory
>> leak, which the AI unfortunately didn't flag. Petr reported it during
>> performance testing and it took me more than a day to find it. Oh well :)
>>
>> Wonder if things got better since then perhaps, and your or Roman's tools
>> would find it today? :)
>
> Yes and no. It didn't find the leak until I changed the prompt to say:
> "there is a leak, find it". I'll see if I can improve things...

Thanks. Hmm, even if the tool has to be told up front that a leak exists, it
could still be a substantial time saver compared to finding the leak myself.

>> diff --git a/mm/slub.c b/mm/slub.c
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>
> [ ... ]
>
>> @@ -3592,54 +3549,53 @@ static void *get_partial_node(struct kmem_cache *s,
>> else if (!spin_trylock_irqsave(&n->list_lock, flags))
>> return NULL;
>> list_for_each_entry_safe(slab, slab2, &n->partial, slab_list) {
>> +
>> + struct freelist_counters old, new;
>> +
>> if (!pfmemalloc_match(slab, pc->flags))
>> continue;
>>
>> if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
>> - void *object = alloc_single_from_partial(s, n, slab,
>> + object = alloc_single_from_partial(s, n, slab,
>> pc->orig_size);
>> - if (object) {
>> - partial = slab;
>> - pc->object = object;
>> + if (object)
>> break;
>> - }
>> continue;
>> }
>>
>> + /*
>> + * get a single object from the slab. This might race against
>> + * __slab_free(), which however has to take the list_lock if
>> + * it's about to make the slab fully free.
>> + */
>> + do {
>> + old.freelist = slab->freelist;
>> + old.counters = slab->counters;
>> +
>> + new.freelist = get_freepointer(s, old.freelist);
>> + new.counters = old.counters;
>> + new.inuse++;
>> +
>> + } while (!__slab_update_freelist(s, slab, &old, &new, "get_partial_node"));
>> +
>> + object = old.freelist;
>> + if (!new.freelist)
>> + remove_partial(n, slab);
>
> Is there a missing break statement here? The debug path above breaks out
> of the loop after successfully allocating an object, but this non-debug
> path continues iterating through the partial list. Each iteration overwrites
> the object variable, so previously allocated objects would be leaked.
>
> The commit message says "Now we only want to return a single object" which
> matches the debug path behavior, but the non-debug path appears to allocate
> from every matching slab in the list.
>
>> }
>> spin_unlock_irqrestore(&n->list_lock, flags);
>> - return partial;
>> + return object;
>> }
>
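
For anyone following along, the pattern described in the review above can be
reproduced outside the kernel. What follows is only a minimal userspace
sketch, not the mm/slub.c code: struct toy_slab and every toy_* helper are
hypothetical stand-ins made up for illustration, and the cmpxchg retry loop
and list_lock are left out entirely. It only shows why a missing break in
the loop takes one object from every matching slab while returning just the
last one.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_OBJS 4

/* Hypothetical stand-in for a slab: a few objects plus simple counters. */
struct toy_slab {
	int objects[TOY_OBJS];	/* backing storage for the objects */
	int nr_free;		/* objects still available */
	int inuse;		/* objects handed out so far */
};

/* Take a single object from one slab, or return NULL if it has none left. */
static int *toy_take_object(struct toy_slab *slab)
{
	assert(slab->nr_free >= 0 && slab->nr_free <= TOY_OBJS);

	if (slab->nr_free == 0)
		return NULL;

	slab->inuse++;
	return &slab->objects[--slab->nr_free];
}

/*
 * Leaky variant: there is no break after a successful allocation, so the
 * loop takes an object from every slab and each iteration overwrites the
 * pointer obtained by the previous one.
 */
static int *toy_get_partial_leaky(struct toy_slab *slabs, size_t nr)
{
	int *object = NULL;

	for (size_t i = 0; i < nr; i++) {
		int *taken = toy_take_object(&slabs[i]);

		if (taken)
			object = taken;	/* earlier object is now unreachable */
	}
	return object;
}

/* Variant with the break: stop as soon as one object has been obtained. */
static int *toy_get_partial_single(struct toy_slab *slabs, size_t nr)
{
	int *object = NULL;

	for (size_t i = 0; i < nr; i++) {
		object = toy_take_object(&slabs[i]);
		if (object)
			break;
	}
	return object;
}

/* Sum of objects marked in use across all slabs. */
static int toy_total_inuse(const struct toy_slab *slabs, size_t nr)
{
	int total = 0;

	for (size_t i = 0; i < nr; i++)
		total += slabs[i].inuse;
	return total;
}
```

With three slabs that each have free objects, the leaky variant leaves three
objects marked in use although the caller receives a single pointer, while
the variant with the break leaves exactly one in use.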