linux-kernel - Re: [for-next][PATCH 2/2] atomic64: Use arch_spin_locks instead of raw_spin

Message-ID: <20250122105517.4f80bf23@gandalf.local.home>
Date: Wed, 22 Jan 2025 10:55:17 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Masami Hiramatsu <mhiramat@...nel.org>,
 Mark Rutland <mark.rutland@....com>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>, Andrew Morton
 <akpm@...ux-foundation.org>, stable@...r.kernel.org, Thomas Gleixner
 <tglx@...utronix.de>, Linus Torvalds <torvalds@...ux-foundation.org>,
 Andreas Larsson <andreas@...sler.com>, Ludwig Rydberg
 <ludwig.rydberg@...sler.com>
Subject: Re: [for-next][PATCH 2/2] atomic64: Use arch_spin_locks instead of
 raw_spin_locks

On Wed, 22 Jan 2025 11:14:57 +0100
Peter Zijlstra <peterz@...radead.org> wrote:

> On Tue, Jan 21, 2025 at 03:19:44PM -0500, Steven Rostedt wrote:
> > From: Steven Rostedt <rostedt@...dmis.org>
> > 
> > raw_spin_locks can be traced by lockdep or tracing itself. Atomic64
> > operations can be used in the tracing infrastructure. When an architecture
> > does not have true atomic64 operations it can use the generic version that
> > disables interrupts and uses spin_locks.
> > 
> > The tracing ring buffer code uses atomic64 operations for the time
> > keeping. But because some architectures use the default operations, the
> > locking inside the atomic operations can cause an infinite recursion.
> > 
> > As atomic64 is an architecture specific operation, it should not   
> 
> used in generic code :-)

Yes, but the atomic64 implementation is architecture specific. I could
change that to be:

  "As the atomic64 implementation is architecture specific, it should not"

> 
> > be using
> > raw_spin_locks() but instead arch_spin_locks as that is the purpose of
> > arch_spin_locks. To be used in architecture specific implementations of
> > generic infrastructure like atomic64 operations.  
> 
> Urgh.. this is horrible. This is why you shouldn't be using atomic64 in
> generic code much :/
> 
> Why not just drop support for those crummy archs? Or drop whatever trace
> feature depends on this.

Can't, that would be a regression. Here's the history. The timestamps of
events are related to each other, as each event only stores the delta from
the previous event (yeah, this causes issues, but it was recommended to do
it this way when it was created, and it can't change now). And as the ring
buffer is lockless, a writer can be preempted by interrupts and NMIs that
inject their own timestamps. It used to be that an interrupted event would
just have a zero delta. If an interrupt came in while an event was being
written, and it created events, all its events would have the same
timestamp as the event it interrupted.

But this caused issues due to not being able to see timings of events from
interrupts that interrupted an event in progress.
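
To make the old behavior concrete, here is a minimal userspace sketch. It
is not the ring buffer code: struct event and rebuild_timestamps() are
made-up names, and the real event ordering is more involved than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record: only the delta from the previous event. */
struct event {
	uint64_t delta;
};

/* Rebuild absolute timestamps by summing the deltas from a known base. */
static void rebuild_timestamps(uint64_t base, const struct event *ev,
			       size_t n, uint64_t *out)
{
	uint64_t ts = base;

	assert(ev != NULL && out != NULL);
	for (size_t i = 0; i < n; i++) {
		ts += ev[i].delta;
		out[i] = ts;
	}
}
```

With deltas of 10, 0, 0, 5 the two zero-delta events come out with the
same timestamp as the event before them, so their real timing is lost.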

I fixed this, but that required doing a 64-bit cmpxchg on the timestamp
when the race occurred. I originally did not use atomic64. Instead, 32-bit
architectures used a "special" timestamp that was broken into multiple
32-bit words, with special logic to try to keep them in sync when this
occurred. But that started becoming too complex with some corner cases, so
I decided to simply let these 32-bit architectures use atomic64. That
worked fine for architectures that have 64-bit atomics and do not rely on
spinlocks.
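
As a rough illustration of what a 64-bit cmpxchg on a timestamp buys you,
here is a userspace sketch using C11 atomics rather than the kernel's
atomic64 API. The names last_ts, ts_update() and ts_update_example() are
invented for this example and do not come from the ring buffer code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a timestamp shared by nested writers. */
static _Atomic uint64_t last_ts;

/*
 * Move last_ts from the value this writer sampled (seen) to its own
 * timestamp (now). Returns false if something else updated last_ts in
 * between, which is the race the writer then has to handle.
 */
static bool ts_update(uint64_t seen, uint64_t now)
{
	return atomic_compare_exchange_strong(&last_ts, &seen, now);
}

/* Usage: the second update loses because last_ts is no longer 100. */
static void ts_update_example(void)
{
	bool first, second;

	atomic_store(&last_ts, 100);
	first = ts_update(100, 150);
	second = ts_update(100, 200);
	assert(first);
	assert(!second);
	assert(atomic_load(&last_ts) == 150);
}
```

On a target without a native 64-bit compare-and-swap, a userspace
toolchain may also have to fall back to a lock-based helper here, which is
roughly the position the generic atomic64 code is in.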

Then I started getting reports of the tracing system causing deadlocks.
That is because raw_spin_lock() is traced. And it should be, as locks do
cause issues and tracing them can help debug those issues. Lockdep and
tracing both use arch_spin_lock() so that they don't recurse into
themselves. Even RCU uses it. So I don't see why there would be any issue
with the atomic64 implementation using it, as it is an even more basic
operation than RCU is.
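
The recursion can be modeled in a few lines of userspace C. This is only a
toy: every name below is a hypothetical stand-in (traced_lock() plays the
role of raw_spin_lock(), untraced_lock() the role of arch_spin_lock()), and
a guard flag is used so the example terminates instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool use_traced_lock;	/* lock flavor the emulated atomic64 takes */
static bool recursed;		/* set when the tracer re-enters itself */
static int trace_depth;
static int64_t timestamp = 42;

static int64_t emulated_atomic64_read(void);

/* Tracer stand-in: recording a lock event needs a timestamp. */
static void trace_lock_event(void)
{
	if (trace_depth > 0) {
		/* Without this guard the calls would never terminate. */
		recursed = true;
		return;
	}
	trace_depth++;
	(void)emulated_atomic64_read();
	trace_depth--;
}

static void traced_lock(void)     { trace_lock_event(); } /* tracer sees it */
static void traced_unlock(void)   { }
static void untraced_lock(void)   { }	/* invisible to the tracer */
static void untraced_unlock(void) { }

/* Lock-based 64-bit read, as on an arch without native atomic64. */
static int64_t emulated_atomic64_read(void)
{
	int64_t val;

	if (use_traced_lock)
		traced_lock();
	else
		untraced_lock();
	val = timestamp;
	if (use_traced_lock)
		traced_unlock();
	else
		untraced_unlock();
	return val;
}

/* Usage: only the traced flavor re-enters the tracer. */
static void recursion_example(void)
{
	int64_t val;

	use_traced_lock = true;
	recursed = false;
	val = emulated_atomic64_read();
	assert(val == 42 && recursed);

	use_traced_lock = false;
	recursed = false;
	val = emulated_atomic64_read();
	assert(val == 42 && !recursed);
}
```

With the traced flavor, taking the lock calls the tracer, which reads the
timestamp, which takes the lock again. With the untraced flavor the cycle
never starts.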

> 
> 
> >  s64 generic_atomic64_read(const atomic64_t *v)
> >  {
> >  	unsigned long flags;
> > -	raw_spinlock_t *lock = lock_addr(v);
> > +	arch_spinlock_t *lock = lock_addr(v);
> >  	s64 val;
> >  
> > -	raw_spin_lock_irqsave(lock, flags);
> > +	local_irq_save(flags);
> > +	arch_spin_lock(lock);  
> 
> Note that this is not an equivalent change. It's probably sufficient,
> but at the very least the Changelog should call out what went missing
> and how that is okay.

What exactly is the difference here that you are talking about? I know that
raw_spin_lock_irqsave() has lots of different variants depending on the
config options, but I'm not sure which one you mean. Is it the fact that
you can't do the different variants with this?

Or is it because it's not checked by lockdep? Hmm, I mentioned that in the
cover letter, but I failed to mention it here in this change log. I can
definitely add that, if that's what you are referring to.

-- Steve