Date: Thu, 2 May 2024 17:16:43 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Al Viro <viro@...iv.linux.org.uk>,
	John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
	elver@...gle.com, akpm@...ux-foundation.org, tglx@...utronix.de,
	peterz@...radead.org, dianders@...omium.org, pmladek@...e.com,
	arnd@...db.de, kernel-team@...a.com,
	Andi Shyti <andi.shyti@...ux.intel.com>,
	Palmer Dabbelt <palmer@...osinc.com>,
	Masami Hiramatsu <mhiramat@...nel.org>, linux-sh@...r.kernel.org
Subject: Re: [PATCH v2 cmpxchg 12/13] sh: Emulate one-byte cmpxchg

On Thu, May 02, 2024 at 04:32:35PM -0700, Linus Torvalds wrote:
> On Thu, 2 May 2024 at 16:12, Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > One of RCU's state machines uses smp_store_release() to start the
> > state machine (only one task gets to do this) and cmpxchg() to update
> > state beyond that point.  And the state is 8 bits so that it and other
> > state fits into 32 bits to allow a single check for multiple conditions
> > elsewhere.
> 
> Note that since alpha lacks the release-acquire model, it's always
> going to be a full memory barrier before the store.
> 
> And then the store turns into a load-mask-store for older alphas.
> 
> So it's going to be a complete mess from a performance standpoint regardless.

And on those older machines, a mess functionally because the other
three bytes in that same 32-bit word can be concurrently updated.
Hence Arnd's patch being necessary here.

EV56 and later all have single-byte stores, so they are OK.  They were
introduced in the mid-1990s, so even they are antiques.  ;-)
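
To make the hazard on the older machines concrete, here is a minimal
userspace sketch using C11 atomics rather than kernel primitives.  It is
not the code from this patch series: the helper names (byte_shift,
lms_load, lms_store, cmpxchg_byte_emulated, replay_lost_update) are
hypothetical, and byte 0 is assumed to be the least significant byte so
that endianness can be ignored.  It replays, on a single thread, a
load-mask-store racing with an update to a neighboring byte, and shows a
one-byte compare-and-exchange built from a 32-bit one, which avoids the
loss by retrying whenever the enclosing word changes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: bit shift of byte idx (0..3) in a 32-bit word.
 * Byte 0 is taken to be the least significant byte. */
static unsigned int byte_shift(unsigned int idx)
{
	return idx * 8;
}

/* First half of a non-atomic load-mask-store: snapshot the word. */
static uint32_t lms_load(_Atomic uint32_t *word)
{
	return atomic_load_explicit(word, memory_order_relaxed);
}

/* Second half: merge the new byte into the (possibly stale) snapshot
 * and store the whole word, overwriting the other three bytes. */
static void lms_store(_Atomic uint32_t *word, uint32_t snapshot,
		      unsigned int idx, uint8_t val)
{
	uint32_t mask = (uint32_t)0xff << byte_shift(idx);
	uint32_t merged = (snapshot & ~mask) |
			  ((uint32_t)val << byte_shift(idx));

	atomic_store_explicit(word, merged, memory_order_relaxed);
}

/* One-byte compare-and-exchange emulated with a 32-bit one.
 * Returns the byte value observed, which equals old on success. */
static uint8_t cmpxchg_byte_emulated(_Atomic uint32_t *word,
				     unsigned int idx,
				     uint8_t old, uint8_t new)
{
	uint32_t mask = (uint32_t)0xff << byte_shift(idx);
	uint32_t cur = atomic_load_explicit(word, memory_order_relaxed);

	for (;;) {
		uint8_t seen = (uint8_t)((cur & mask) >> byte_shift(idx));
		uint32_t want;

		if (seen != old)
			return seen;	/* target byte differs: fail */
		want = (cur & ~mask) | ((uint32_t)new << byte_shift(idx));
		/* On failure, cur is refreshed with the current word. */
		if (atomic_compare_exchange_weak(word, &cur, want))
			return old;
	}
}

/* Single-threaded replay of the bad interleaving; returns final word. */
static uint32_t replay_lost_update(void)
{
	_Atomic uint32_t word = 0;
	uint32_t snap = lms_load(&word);	  /* "CPU 0" loads */

	cmpxchg_byte_emulated(&word, 1, 0, 0xbb); /* "CPU 1" sets byte 1 */
	lms_store(&word, snap, 0, 0xaa);	  /* "CPU 0" stores stale bytes */
	return atomic_load(&word);
}
```

In this replay the final word is 0xaa rather than 0xbbaa: the update to
byte 1 is silently overwritten by the stale snapshot, which is the
functional problem on machines without single-byte stores.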

> Happily, I doubt anybody really cares.

Here is hoping!

> I've occasionally wondered if we have situations where the
> "smp_store_release()" only cares about previous *writes* being ordered
> (ie a "smp_wmb()+WRITE_ONCE" would be sufficient).

Back in the day, rcu_assign_pointer() worked this way.  But later there
were a few use cases where ordering prior reads was needed.

And in this case, we just barely need that full store-release
functionality.  There is a preceding mutex lock-unlock pair that provides
a full barrier post-boot on almost all systems.

> It makes no difference on x86 (all stores are releases), power64 (wmb
> and store_release are both LWSYNC) or arm64 (str is documented to be
> cheaper than DMB).
> 
> On alpha, smp_wmb()+WRITE_ONCE() is cheaper than smp_store_release(),
> but nobody sane cares.
> 
> But *if* we have a situation where the "smp_store_release()" might be
> just a "previous writes need to be visible" rather than ordering
> previous reads too, we could maybe introduce that kind of op. I
> _think_ the RCU writes tend to be of that kind?

Most of the time, rcu_assign_pointer() only needs to order prior writes,
not both reads and writes.  In theory, we could make something like
an rcu_assign_pointer_reads_too(), though hopefully with a shorter name,
and go back to smp_wmb() for rcu_assign_pointer().

But in practice, I am having a really hard time convincing myself that
it would be worth it.
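
For reference, the two publication styles under discussion can be
sketched in userspace C11 as follows.  This is only a rough analogy and
not kernel code: struct item, published, publish_release(),
publish_fence_then_store(), and read_payload() are hypothetical names.
C11 also has no store-store-only fence, so the release fence below is a
stand-in that is stronger than smp_wmb(), and the acquire load on the
reader side is stronger than what rcu_dereference() requires.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item {
	int payload;
};

/* Hypothetical shared pointer; starts out NULL. */
static struct item *_Atomic published;

/* Analogue of smp_store_release(): both prior reads and prior writes
 * are ordered before the pointer store. */
static void publish_release(struct item *p)
{
	atomic_store_explicit(&published, p, memory_order_release);
}

/* Rough analogue of smp_wmb() followed by WRITE_ONCE().  The release
 * fence is an assumption standing in for a write-only barrier, which
 * C11 does not provide; it orders prior reads as well. */
static void publish_fence_then_store(struct item *p)
{
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&published, p, memory_order_relaxed);
}

/* Reader side: load the pointer, then dereference it if non-NULL. */
static int read_payload(void)
{
	struct item *p = atomic_load_explicit(&published,
					      memory_order_acquire);

	return p ? p->payload : -1;
}
```

Either publisher makes the initialization of the item visible before
the pointer itself; the difference being weighed above is only whether
prior reads must be ordered as well.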

							Thanx, Paul
