Message-ID: <CANn89iLvu_ijSV+FnrZ6qei_Ty2c9V-RjwLMk24+oOS91bNjaQ@mail.gmail.com>
Date: Mon, 31 Mar 2025 19:42:43 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Waiman Long <llong@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Breno Leitao <leitao@...ian.org>, Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>, aeh@...a.com,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org, jhs@...atatu.com,
kernel-team@...a.com, Erik Lundgren <elundgren@...a.com>,
"Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization

On Mon, Mar 31, 2025 at 7:26 PM Boqun Feng <boqun.feng@...il.com> wrote:
>
> On Wed, Mar 26, 2025 at 11:39:49AM -0400, Waiman Long wrote:
> [...]
> > > > Anyway, that may work. The only problem that I see is the issue of nesting
> > > > of an interrupt context on top of a task context. It is possible that the
> > > > first use of a raw_spinlock may happen in an interrupt context. If the
> > > > interrupt happens while the task has set the hazard pointer and is iterating the
> > > > hash list, the value of the hazard pointer may be overwritten. Alternatively
> > > > we could have multiple slots for the hazard pointer, but that will make the
> > > > code more complicated. Or we could disable interrupts before setting the
> > > > hazard pointer.
> > > Or we can use lockdep_recursion:
> > >
> > > preempt_disable();
> > > lockdep_recursion_inc();
> > > barrier();
> > >
> > > WRITE_ONCE(*hazptr, ...);
> > >
> > > , it should prevent re-entrance into lockdep from irq context.
> > That will probably work. Or we can disable irq. I am fine with both.
>
> Disabling irq may not work in this case, because an NMI can also happen
> and call register_lock_class().
>
> I'm experimenting with a new idea here; it might be better (for general
> cases), and it is in a similar spirit to moving the protection scope of
> a hazard pointer from a key to a hash_list: we can introduce a wildcard
> address, and whenever we do a synchronize_hazptr(), if the hazptr slot
> equals the wildcard, we treat it as matching any ptr, so
> synchronize_hazptr() will still wait until the slot is zero'd. Not only
> could this help in the nesting case, it could also be used if users
> want to protect multiple things with this simple hazard pointer
> implementation.
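
A minimal sketch of the wildcard idea, assuming one hazptr slot per CPU and
a hypothetical HAZPTR_WILDCARD value (illustrative only, not the actual
hazptr implementation):

#define HAZPTR_WILDCARD		((void *)1UL)	/* hypothetical marker value */

static void *hazptr_slots[NR_CPUS];		/* one slot per CPU, sketch only */

static void synchronize_hazptr(void *ptr)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		/*
		 * A slot holding HAZPTR_WILDCARD matches any pointer, so
		 * wait until the slot holds neither @ptr nor the wildcard.
		 */
		while (READ_ONCE(hazptr_slots[cpu]) == ptr ||
		       READ_ONCE(hazptr_slots[cpu]) == HAZPTR_WILDCARD)
			cpu_relax();
	}
}
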
For the record, this was my suggestion.
Breno, what do you think? Feel free to grab, test, add comments and a
nice changelog, thanks!
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 67964dc4db952ea11d4b88554383ea0ec5946ef9..fb56bcb887ca163525f003cb7880c76511d166e7 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -117,7 +117,12 @@ do {							\
} while (0)
extern void lockdep_register_key(struct lock_class_key *key);
-extern void lockdep_unregister_key(struct lock_class_key *key);
+
+extern void __lockdep_unregister_key(struct lock_class_key *key, bool sync);
+static inline void lockdep_unregister_key(struct lock_class_key *key)
+{
+ __lockdep_unregister_key(key, true);
+}
/*
* These methods are used by specific locking variants (spinlocks,
@@ -372,8 +377,12 @@ static inline void lockdep_register_key(struct lock_class_key *key)
{
}
+static inline void __lockdep_unregister_key(struct lock_class_key *key, bool sync)
+{
+}
static inline void lockdep_unregister_key(struct lock_class_key *key)
{
+ __lockdep_unregister_key(key, true);
}
#define lockdep_depth(tsk) (0)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index b15757e6362615aeecb1629f8e60375e13a87e6d..3b754c7fdaf19887734b1ee37f7c058f8f751a4e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -6574,7 +6574,7 @@ void lockdep_reset_lock(struct lockdep_map *lock)
* key irrespective of debug_locks to avoid potential invalid access to freed
* memory in lock_class entry.
*/
-void lockdep_unregister_key(struct lock_class_key *key)
+void __lockdep_unregister_key(struct lock_class_key *key, bool sync)
{
struct hlist_head *hash_head = keyhashentry(key);
struct lock_class_key *k;
@@ -6610,8 +6610,11 @@ void lockdep_unregister_key(struct lock_class_key *key)
if (need_callback)
call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
- /* Wait until is_dynamic_key() has finished accessing k->hash_entry. */
- synchronize_rcu();
+ /* Wait until is_dynamic_key() has finished accessing k->hash_entry.
+ * @sync is false for callers dealing with the sync themselves.
+ */
+ if (sync)
+ synchronize_rcu();
}
-EXPORT_SYMBOL_GPL(lockdep_unregister_key);
+EXPORT_SYMBOL_GPL(__lockdep_unregister_key);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 14ab2f4c190a1e201dd1788b413a06e799a829f2..9affef187237a13559642d435e1a196a909f75f9 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -988,8 +988,8 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
return sch;
errout1:
- lockdep_unregister_key(&sch->root_lock_key);
- kfree(sch);
+ __lockdep_unregister_key(&sch->root_lock_key, false);
+ kfree_rcu(sch, rcu);
errout:
return ERR_PTR(err);
}
@@ -1077,7 +1077,7 @@ static void __qdisc_destroy(struct Qdisc *qdisc)
if (ops->destroy)
ops->destroy(qdisc);
- lockdep_unregister_key(&qdisc->root_lock_key);
+ __lockdep_unregister_key(&qdisc->root_lock_key, false);
module_put(ops->owner);
netdev_put(dev, &qdisc->dev_tracker);
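
For a caller passing sync == false, the contract is that the memory
embedding the lock_class_key must itself outlive an RCU grace period,
e.g. via kfree_rcu() as in the qdisc paths above. A minimal sketch with
a hypothetical struct foo:

struct foo {
	struct lock_class_key key;
	struct rcu_head rcu;
};

static void foo_free(struct foo *f)
{
	/* Skip the synchronize_rcu() inside __lockdep_unregister_key()... */
	__lockdep_unregister_key(&f->key, false);
	/*
	 * ...because kfree_rcu() already defers the free past a grace
	 * period, so is_dynamic_key() can no longer be touching
	 * f->key.hash_entry by the time the memory is reused.
	 */
	kfree_rcu(f, rcu);
}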