Message-Id: <20260123-sheaves-for-all-v4-2-041323d506f7@suse.cz>
Date: Fri, 23 Jan 2026 07:52:40 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Harry Yoo <harry.yoo@...cle.com>, Petr Tesarik <ptesarik@...e.com>, 
 Christoph Lameter <cl@...two.org>, David Rientjes <rientjes@...gle.com>, 
 Roman Gushchin <roman.gushchin@...ux.dev>
Cc: Hao Li <hao.li@...ux.dev>, Andrew Morton <akpm@...ux-foundation.org>, 
 Uladzislau Rezki <urezki@...il.com>, 
 "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
 Suren Baghdasaryan <surenb@...gle.com>, 
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>, 
 Alexei Starovoitov <ast@...nel.org>, linux-mm@...ck.org, 
 linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev, 
 bpf@...r.kernel.org, kasan-dev@...glegroups.com, 
 Vlastimil Babka <vbabka@...e.cz>, "Paul E. McKenney" <paulmck@...nel.org>
Subject: [PATCH v4 02/22] mm/slab: fix false lockdep warning in
 __kfree_rcu_sheaf()

From: Harry Yoo <harry.yoo@...cle.com>

kvfree_call_rcu() can be called while holding a raw_spinlock_t.
__kfree_rcu_sheaf() may acquire a spinlock_t, which becomes a sleeping
lock on PREEMPT_RT, and that would violate lock nesting rules. For this
reason kvfree_call_rcu() bypasses the sheaves layer entirely on PREEMPT_RT.

However, lockdep still complains about acquiring spinlock_t while holding
raw_spinlock_t, even on !PREEMPT_RT where spinlock_t is a spinning lock.
This causes a false lockdep warning [1]:

 =============================
 [ BUG: Invalid wait context ]
 6.19.0-rc6-next-20260120 #21508 Not tainted
 -----------------------------
 migration/1/23 is trying to lock:
 ffff8afd01054e98 (&barn->lock){..-.}-{3:3}, at: barn_get_empty_sheaf+0x1d/0xb0
 other info that might help us debug this:
 context-{5:5}
 3 locks held by migration/1/23:
  #0: ffff8afd01fd89a8 (&p->pi_lock){-.-.}-{2:2}, at: __balance_push_cpu_stop+0x3f/0x200
  #1: ffffffff9f15c5c8 (rcu_read_lock){....}-{1:3}, at: cpuset_cpus_allowed_fallback+0x27/0x250
  #2: ffff8afd1f470be0 ((local_lock_t *)&pcs->lock){+.+.}-{3:3}, at: __kfree_rcu_sheaf+0x52/0x3d0
 stack backtrace:
 CPU: 1 UID: 0 PID: 23 Comm: migration/1 Not tainted 6.19.0-rc6-next-20260120 #21508 PREEMPTLAZY
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
 Stopper: __balance_push_cpu_stop+0x0/0x200 <- balance_push+0x118/0x170
 Call Trace:
  <TASK>
  __dump_stack+0x22/0x30
  dump_stack_lvl+0x60/0x80
  dump_stack+0x19/0x24
  __lock_acquire+0xd3a/0x28e0
  ? __lock_acquire+0x5a9/0x28e0
  ? __lock_acquire+0x5a9/0x28e0
  ? barn_get_empty_sheaf+0x1d/0xb0
  lock_acquire+0xc3/0x270
  ? barn_get_empty_sheaf+0x1d/0xb0
  ? __kfree_rcu_sheaf+0x52/0x3d0
  _raw_spin_lock_irqsave+0x47/0x70
  ? barn_get_empty_sheaf+0x1d/0xb0
  barn_get_empty_sheaf+0x1d/0xb0
  ? __kfree_rcu_sheaf+0x52/0x3d0
  __kfree_rcu_sheaf+0x19f/0x3d0
  kvfree_call_rcu+0xaf/0x390
  set_cpus_allowed_force+0xc8/0xf0
  [...]
  </TASK>

This wasn't triggered until sheaves were enabled for all slab caches,
since kfree_rcu() was not previously called with a raw spinlock held for
the caches that had sheaves (vma, maple node).

As suggested by Vlastimil Babka, fix this by using a lockdep map with
LD_WAIT_CONFIG wait type to tell lockdep that acquiring spinlock_t is valid
in this case, as those spinlocks won't be used on PREEMPT_RT.

Note that kfree_rcu_sheaf_map must be acquired using the _try() variant,
otherwise the acquisition of the lockdep map itself will trigger an
invalid wait context warning.
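
The reasoning above can be sketched with a small userspace model. This is
not kernel code: it is a simplified approximation of lockdep's wait-type
check, and the names `struct held`, `acquire()` and `replay()` are
hypothetical. It assumes the wait-type ordering implied by the {outer:inner}
annotations in the splat (2 = spin, 3 = config) and ignores irq-context
tracking.

```c
#include <assert.h>
#include <stdbool.h>

/* Wait types, ordered from most to least restrictive (assumed ordering). */
enum wait_type { WAIT_INV, WAIT_FREE, WAIT_SPIN, WAIT_CONFIG, WAIT_SLEEP, WAIT_MAX };

/* Hypothetical model of one held lock (or annotation map). */
struct held {
	enum wait_type outer;  /* context needed to acquire it */
	enum wait_type inner;  /* context it imposes while held */
	bool override;         /* true for a wait-override map */
};

#define MAX_HELD 8
static struct held held_stack[MAX_HELD];
static int depth;

/*
 * Model of the wait-context check. Returns false where the model would
 * flag "Invalid wait context"; otherwise records the lock as held.
 */
static bool acquire(struct held next, bool trylock)
{
	enum wait_type curr_inner = WAIT_MAX;  /* plain task context */

	for (int i = 0; i < depth; i++) {
		if (held_stack[i].inner < curr_inner)
			curr_inner = held_stack[i].inner;
		/* An override map resets the context to its own wait type. */
		if (held_stack[i].override)
			curr_inner = held_stack[i].inner;
	}

	/* Trylock acquisitions are never checked in this model. */
	if (!trylock && next.outer > curr_inner)
		return false;

	assert(depth < MAX_HELD);
	held_stack[depth++] = next;
	return true;
}

/* Replays the lock sequence from the splat; true means no violation. */
bool replay(bool use_map, bool map_trylock)
{
	const struct held pi_lock   = { WAIT_SPIN,   WAIT_SPIN,   false };
	const struct held rcu_read  = { WAIT_FREE,   WAIT_CONFIG, false };
	const struct held sheaf_map = { WAIT_CONFIG, WAIT_CONFIG, true  };
	const struct held pcs_lock  = { WAIT_CONFIG, WAIT_CONFIG, false };
	const struct held barn_lock = { WAIT_CONFIG, WAIT_CONFIG, false };

	depth = 0;
	if (!acquire(pi_lock, false))
		return false;
	if (!acquire(rcu_read, false))
		return false;
	if (use_map && !acquire(sheaf_map, map_trylock))
		return false;
	if (!acquire(pcs_lock, true))  /* taken with local_trylock() */
		return false;
	return acquire(barn_lock, false);
}
```

In this model, replay(false, false) reports a violation at the barn lock,
replay(true, true) passes, and replay(true, false) reports a violation at
the map itself, which is why the _try() variant is needed.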

Reported-by: Paul E. McKenney <paulmck@...nel.org>
Closes: https://lore.kernel.org/linux-mm/c858b9af-2510-448b-9ab3-058f7b80dd42@paulmck-laptop [1]
Fixes: ec66e0d59952 ("slab: add sheaf support for batching kfree_rcu() operations")
Suggested-by: Vlastimil Babka <vbabka@...e.cz>
Signed-off-by: Harry Yoo <harry.yoo@...cle.com>
Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
---
 mm/slub.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index df71c156d13c..4eb60e99abd7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -6268,11 +6268,26 @@ static void rcu_free_sheaf(struct rcu_head *head)
 	free_empty_sheaf(s, sheaf);
 }
 
+/*
+ * kvfree_call_rcu() can be called while holding a raw_spinlock_t. Since
+ * __kfree_rcu_sheaf() may acquire a spinlock_t (sleeping lock on PREEMPT_RT),
+ * this would violate lock nesting rules. Therefore, kvfree_call_rcu() avoids
+ * this problem by bypassing the sheaves layer entirely on PREEMPT_RT.
+ *
+ * However, lockdep still complains that it is invalid to acquire spinlock_t
+ * while holding raw_spinlock_t, even on !PREEMPT_RT where spinlock_t is a
+ * spinning lock. Tell lockdep that acquiring spinlock_t is valid here
+ * by temporarily raising the wait-type to LD_WAIT_CONFIG.
+ */
+static DEFINE_WAIT_OVERRIDE_MAP(kfree_rcu_sheaf_map, LD_WAIT_CONFIG);
+
 bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
 {
 	struct slub_percpu_sheaves *pcs;
 	struct slab_sheaf *rcu_sheaf;
 
+	lock_map_acquire_try(&kfree_rcu_sheaf_map);
+
 	if (!local_trylock(&s->cpu_sheaves->lock))
 		goto fail;
 
@@ -6349,10 +6364,12 @@ bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
 	local_unlock(&s->cpu_sheaves->lock);
 
 	stat(s, FREE_RCU_SHEAF);
+	lock_map_release(&kfree_rcu_sheaf_map);
 	return true;
 
 fail:
 	stat(s, FREE_RCU_SHEAF_FAIL);
+	lock_map_release(&kfree_rcu_sheaf_map);
 	return false;
 }
 

-- 
2.52.0

