linux-kernel - Re: [PATCH] lockdep: Speed up lockdep_unregister

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <35039448-d8e8-4a7d-b59b-758d81330d4b@paulmck-laptop>
Date: Mon, 31 Mar 2025 11:33:15 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Waiman Long <llong@...hat.com>
Cc: Boqun Feng <boqun.feng@...il.com>, Eric Dumazet <edumazet@...gle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Breno Leitao <leitao@...ian.org>, Ingo Molnar <mingo@...hat.com>,
	Will Deacon <will@...nel.org>, aeh@...a.com,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	jhs@...atatu.com, kernel-team@...a.com,
	Erik Lundgren <elundgren@...a.com>
Subject: Re: [PATCH] lockdep: Speed up lockdep_unregister_key() with
 expedited RCU synchronization

On Mon, Mar 31, 2025 at 01:33:22PM -0400, Waiman Long wrote:
> On 3/31/25 1:26 PM, Boqun Feng wrote:
> > On Wed, Mar 26, 2025 at 11:39:49AM -0400, Waiman Long wrote:
> > [...]
> > > > > Anyway, that may work. The only problem that I see is the issue of nesting
> > > > > of an interrupt context on top of a task context. It is possible that the
> > > > > first use of a raw_spinlock may happen in an interrupt context. If the
> > > > > interrupt happens when the task has set the hazard pointer and iterating the
> > > > > hash list, the value of the hazard pointer may be overwritten. Alternatively
> > > > > we could have multiple slots for the hazard pointer, but that will make the
> > > > > code more complicated. Or we could disable interrupt before setting the
> > > > > hazard pointer.
> > > > Or we can use lockdep_recursion:
> > > > 
> > > > 	preempt_disable();
> > > > 	lockdep_recursion_inc();
> > > > 	barrier();
> > > > 
> > > > 	WRITE_ONCE(*hazptr, ...);
> > > > 
> > > > , it should prevent the re-entrant of lockdep in irq.
> > > That will probably work. Or we can disable irq. I am fine with both.
> > Disabling irq may not work in this case, because an NMI can also happen
> > and call register_lock_class().
> Right, disabling irq doesn't work with NMI. So incrementing the recursion
> count is likely the way to go and I think it will work even in the NMI case.
> 
> > 
> > I'm experimenting a new idea here, it might be better (for general
> > cases), and this has the similar spirit that we could move the
> > protection scope of a hazard pointer from a key to a hash_list: we can
> > introduce a wildcard address, and whenever we do a synchronize_hazptr(),
> > if the hazptr slot equal to wildcard, we treat as it matches to any ptr,
> > hence synchronize_hazptr() will still wait until it's zero'd. Not only
> > this could help in the nesting case, it can also be used if the users
> > want to protect multiple things with this simple hazard pointer
> > implementation.
> 
> I think it is a good idea to add a wildcard for the general use case.
> Setting the hazptr to the list head will be enough for this particular case.

Careful!  If we enable use of wildcards outside of the special case
of synchronize_hazptr(), we give up the small-memory-footprint advantages
of hazard pointers.  You end up having to wait on all hazard-pointer
readers, which was exactly why RCU was troublesome here.  ;-)

							Thanx, Paul