Message-ID: <20251024140416.642903-1-clm@meta.com>
Date: Fri, 24 Oct 2025 07:04:12 -0700
From: Chris Mason <clm@...a.com>
To: Vlastimil Babka <vbabka@...e.cz>
CC: Chris Mason <clm@...a.com>, Andrew Morton <akpm@...ux-foundation.org>,
	Christoph Lameter <cl@...two.org>, David Rientjes <rientjes@...gle.com>,
	Roman Gushchin <roman.gushchin@...ux.dev>,
	Harry Yoo <harry.yoo@...cle.com>, Uladzislau Rezki <urezki@...il.com>,
	"Liam R. Howlett" <Liam.Howlett@...cle.com>,
	Suren Baghdasaryan <surenb@...gle.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Alexei Starovoitov <ast@...nel.org>, <linux-mm@...ck.org>,
	<linux-kernel@...r.kernel.org>, <linux-rt-devel@...ts.linux.dev>,
	<bpf@...r.kernel.org>, <kasan-dev@...glegroups.com>
Subject: Re: [PATCH RFC 07/19] slab: make percpu sheaves compatible with kmalloc_nolock()/kfree_nolock()
On Thu, 23 Oct 2025 15:52:29 +0200 Vlastimil Babka <vbabka@...e.cz> wrote:
> Before we enable percpu sheaves for kmalloc caches, we need to make sure
> kmalloc_nolock() and kfree_nolock() will continue working properly and
> not spin when not allowed to.
>
> Percpu sheaves themselves use local_trylock() so they are already
> compatible. We just need to be careful with the barn->lock spin_lock.
> Pass a new allow_spin parameter where necessary to use
> spin_trylock_irqsave().
>
> In kmalloc_nolock_noprof() we can now attempt alloc_from_pcs() safely,
> for now it will always fail until we enable sheaves for kmalloc caches
> next. Similarly in kfree_nolock() we can attempt free_to_pcs().
>
Hi Vlastimil,

We're trying out the AI patch review automation on the BPF commits, and it had
some questions about a few of these. Since the goal is to actually test the
automation, I'm including its comments unedited, but I did try to make sure
they were valid before sending.
This one is a little verbose for s/NULL/ERR_PTR/ suggestions, but:
> diff --git a/mm/slub.c b/mm/slub.c
> index ecb10ed5acfe..5d0b2cf66520 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
[ ... ]
> @@ -2990,7 +2998,8 @@ barn_replace_empty_sheaf(struct node_barn *barn, struct slab_sheaf *empty)
> * barn. But if there are too many full sheaves, reject this with -E2BIG.
> */
> static struct slab_sheaf *
> -barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
> +barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full,
> + bool allow_spin)
> {
> struct slab_sheaf *empty;
> unsigned long flags;
> @@ -3001,7 +3010,10 @@ barn_replace_full_sheaf(struct node_barn *barn, struct slab_sheaf *full)
> if (!data_race(barn->nr_empty))
> return ERR_PTR(-ENOMEM);
>
> - spin_lock_irqsave(&barn->lock, flags);
> + if (likely(allow_spin))
> + spin_lock_irqsave(&barn->lock, flags);
> + else if (!spin_trylock_irqsave(&barn->lock, flags))
> + return NULL;
barn_replace_full_sheaf() now returns NULL when the trylock fails, in
addition to its existing ERR_PTR(-ENOMEM) and ERR_PTR(-E2BIG) returns.
[ ... ]
> @@ -6047,7 +6070,8 @@ static void __pcs_install_empty_sheaf(struct kmem_cache *s,
> * unlocked.
> */
> static struct slub_percpu_sheaves *
> -__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
> +__pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
> + bool allow_spin)
> {
> struct slab_sheaf *empty;
> struct node_barn *barn;
> @@ -6071,7 +6095,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
> put_fail = false;
>
> if (!pcs->spare) {
> - empty = barn_get_empty_sheaf(barn);
> + empty = barn_get_empty_sheaf(barn, allow_spin);
> if (empty) {
> pcs->spare = pcs->main;
> pcs->main = empty;
> @@ -6085,7 +6109,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
> return pcs;
> }
>
> - empty = barn_replace_full_sheaf(barn, pcs->main);
> + empty = barn_replace_full_sheaf(barn, pcs->main, allow_spin);
>
> if (!IS_ERR(empty)) {
> stat(s, BARN_PUT);
> pcs->main = empty;
> return pcs;
> }
Can this set pcs->main to NULL? When allow_spin=false and the trylock
fails, barn_replace_full_sheaf() returns NULL. Since !IS_ERR(NULL) is
true, the code executes pcs->main = NULL and returns pcs. This violates
the documented invariant that main is "never NULL when unlocked" (from
struct slub_percpu_sheaves comment at mm/slub.c:481).
The caller free_to_pcs() then dereferences pcs->main->objects at
mm/slub.c:6208, which would be a NULL pointer dereference:
pcs->main->objects[pcs->main->size++] = object;
The execution path is:
1. kfree_nolock() calls free_to_pcs() with allow_spin=false
2. free_to_pcs() calls __pcs_replace_full_main() with allow_spin=false
3. pcs->spare exists and is full (size == s->sheaf_capacity)
4. barn_replace_full_sheaf() trylock fails, returns NULL
5. !IS_ERR(NULL) = true, so pcs->main = NULL is executed
6. __pcs_replace_full_main() returns pcs with NULL main
7. free_to_pcs() dereferences pcs->main->objects causing NULL deref
Should the check be if (empty && !IS_ERR(empty)) to properly handle the
new NULL return case?
> @@ -6093,6 +6117,11 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs)
> return pcs;
> }
>
> + if (!allow_spin) {
> + local_unlock(&s->cpu_sheaves->lock);
> + return NULL;
> + }
This check is never reached when barn_replace_full_sheaf() returns NULL,
because the early return at the !IS_ERR(empty) check above executes first.
[ ... ]