Message-ID: <CANn89iLvu_ijSV+FnrZ6qei_Ty2c9V-RjwLMk24+oOS91bNjaQ@mail.gmail.com>
Date: Mon, 31 Mar 2025 19:42:43 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Waiman Long <llong@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Breno Leitao <leitao@...ian.org>, Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>, aeh@...a.com,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org, jhs@...atatu.com,
kernel-team@...a.com, Erik Lundgren <elundgren@...a.com>,
"Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with expedited RCU synchronization

On Mon, Mar 31, 2025 at 7:26 PM Boqun Feng <boqun.feng@...il.com> wrote:
>
> On Wed, Mar 26, 2025 at 11:39:49AM -0400, Waiman Long wrote:
> [...]
> > > > Anyway, that may work. The only problem that I see is the issue of nesting
> > > > of an interrupt context on top of a task context. It is possible that the
> > > > first use of a raw_spinlock may happen in an interrupt context. If the
> > > > interrupt happens while the task has set the hazard pointer and is iterating the
> > > > hash list, the value of the hazard pointer may be overwritten. Alternatively
> > > > we could have multiple slots for the hazard pointer, but that will make the
> > > > code more complicated. Or we could disable interrupts before setting the
> > > > hazard pointer.
> > > Or we can use lockdep_recursion:
> > >
> > > preempt_disable();
> > > lockdep_recursion_inc();
> > > barrier();
> > >
> > > WRITE_ONCE(*hazptr, ...);
> > >
> > > , it should prevent re-entrance into lockdep from irq context.
> > That will probably work. Or we can disable irq. I am fine with both.
>
> Disabling irq may not work in this case, because an NMI can also happen
> and call register_lock_class().
>
> I'm experimenting with a new idea here; it might be better (for general
> cases), and it is in a similar spirit to moving the protection scope of
> a hazard pointer from a key to a hash_list: we can introduce a wildcard
> address, and whenever we do a synchronize_hazptr(), if the hazptr slot
> equals the wildcard, we treat it as matching any ptr, so
> synchronize_hazptr() will still wait until the slot is zero'd. Not only
> could this help in the nesting case, it could also be used if users
> want to protect multiple things with this simple hazard pointer
> implementation.
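
A minimal sketch of the wildcard idea, assuming one hazptr slot per CPU and
a hypothetical HAZPTR_WILDCARD value (illustrative only, not the actual
hazptr implementation):

#define HAZPTR_WILDCARD		((void *)1UL)	/* hypothetical marker value */

static void *hazptr_slots[NR_CPUS];		/* one slot per CPU, sketch only */

static void synchronize_hazptr(void *ptr)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		/*
		 * A slot holding HAZPTR_WILDCARD matches any pointer, so
		 * wait until the slot holds neither @ptr nor the wildcard.
		 */
		while (READ_ONCE(hazptr_slots[cpu]) == ptr ||
		       READ_ONCE(hazptr_slots[cpu]) == HAZPTR_WILDCARD)
			cpu_relax();
	}
}
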
For the record, this was my suggestion.
Breno, what do you think? Feel free to grab, test, add comments and a
nice changelog, thanks!
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 67964dc4db952ea11d4b88554383ea0ec5946ef9..fb56bcb887ca163525f003cb7880c76511d166e7 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -117,7 +117,12 @@ do {							\
} while (0)
extern void lockdep_register_key(struct lock_class_key *key);
-extern void lockdep_unregister_key(struct lock_class_key *key);
+
+extern void __lockdep_unregister_key(struct lock_class_key *key, bool sync);
+static inline void lockdep_unregister_key(struct lock_class_key *key)
+{
+ __lockdep_unregister_key(key, true);
+}
/*
* These methods are used by specific locking variants (spinlocks,
@@ -372,8 +377,12 @@ static inline void lockdep_register_key(struct lock_class_key *key)
{
}
+static inline void __lockdep_unregister_key(struct lock_class_key *key, bool sync)
+{
+}
static inline void lockdep_unregister_key(struct lock_class_key *key)
{
+ __lockdep_unregister_key(key, true);
}
#define lockdep_depth(tsk) (0)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index b15757e6362615aeecb1629f8e60375e13a87e6d..3b754c7fdaf19887734b1ee37f7c058f8f751a4e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -6574,7 +6574,7 @@ void lockdep_reset_lock(struct lockdep_map *lock)
* key irrespective of debug_locks to avoid potential invalid access to freed
* memory in lock_class entry.
*/
-void lockdep_unregister_key(struct lock_class_key *key)
+void __lockdep_unregister_key(struct lock_class_key *key, bool sync)
{
struct hlist_head *hash_head = keyhashentry(key);
struct lock_class_key *k;
@@ -6610,8 +6610,11 @@ void lockdep_unregister_key(struct lock_class_key *key)
if (need_callback)
call_rcu(&delayed_free.rcu_head, free_zapped_rcu);
- /* Wait until is_dynamic_key() has finished accessing k->hash_entry. */
- synchronize_rcu();
+ /* Wait until is_dynamic_key() has finished accessing k->hash_entry.
+ * @sync is false for callers dealing with the sync themselves.
+ */
+ if (sync)
+ synchronize_rcu();
}
-EXPORT_SYMBOL_GPL(lockdep_unregister_key);
+EXPORT_SYMBOL_GPL(__lockdep_unregister_key);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 14ab2f4c190a1e201dd1788b413a06e799a829f2..9affef187237a13559642d435e1a196a909f75f9 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -988,8 +988,8 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
return sch;
errout1:
- lockdep_unregister_key(&sch->root_lock_key);
- kfree(sch);
+ __lockdep_unregister_key(&sch->root_lock_key, false);
+ kfree_rcu(sch, rcu);
errout:
return ERR_PTR(err);
}
@@ -1077,7 +1077,7 @@ static void __qdisc_destroy(struct Qdisc *qdisc)
if (ops->destroy)
ops->destroy(qdisc);
- lockdep_unregister_key(&qdisc->root_lock_key);
+ __lockdep_unregister_key(&qdisc->root_lock_key, false);
module_put(ops->owner);
netdev_put(dev, &qdisc->dev_tracker);
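
For a caller passing sync == false, the contract is that the memory
embedding the lock_class_key must itself outlive an RCU grace period,
e.g. via kfree_rcu() as in the qdisc paths above. A minimal sketch with
a hypothetical struct foo:

struct foo {
	struct lock_class_key key;
	struct rcu_head rcu;
};

static void foo_free(struct foo *f)
{
	/* Skip the synchronize_rcu() inside __lockdep_unregister_key()... */
	__lockdep_unregister_key(&f->key, false);
	/*
	 * ...because kfree_rcu() already defers the free past a grace
	 * period, so is_dynamic_key() can no longer be touching
	 * f->key.hash_entry by the time the memory is reused.
	 */
	kfree_rcu(f, rcu);
}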