Message-ID: <CAHk-=wgG6Dmt1JTXDbrbXh_6s2yLjL=9pHo7uv0==LHFD+aBtg@mail.gmail.com>
Date: Wed, 6 Mar 2024 10:43:25 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, linke li <lilinke99@...com>, joel@...lfernandes.org, 
	boqun.feng@...il.com, dave@...olabs.net, frederic@...nel.org, 
	jiangshanlai@...il.com, josh@...htriplett.org, linux-kernel@...r.kernel.org, 
	mathieu.desnoyers@...icios.com, qiang.zhang1211@...il.com, 
	quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer()
 data race and concurrency bug

On Wed, 6 Mar 2024 at 09:59, Steven Rostedt <rostedt@...dmis.org> wrote:
>
> IIRC, the original purpose of READ_ONCE() and WRITE_ONCE() was to make sure
> that the compiler only reads or writes the variable "once". Hence the name.
> That way after a load, you don't need to worry that the content of the
> variable you read isn't going to be read again from the original location
> because the compiler decided to save stack space and registers.
>
> But that macro has now been extended for other purposes.

Not really.

Tearing of simple types (as opposed to structures or bitfields or
"more than one word" or whatever) has never really been a real
concern.

It keeps being brought up as a "compilers could do this", but it's
basically just BS fear-mongering. Compilers _don't_ do it, and the
reason why compilers don't do it isn't some "compilers are trying to
be nice" issue, but simply a "it is insane and generates worse code"
issue.

So what happens is that READ_ONCE() and WRITE_ONCE() have always been
about reading and writing *consistent* values. There is no locking,
but the idea is - and has always been - that you get one *single*
answer from READ_ONCE(), and that single answer will always be
consistent with something that has been written by WRITE_ONCE().

That's often useful - lots of code doesn't really care if you get the
old or the new value, but the code *does* care that it gets *one*
value, and not some random mix of "I tested one value for validity,
then it got reloaded due to register pressure, and I actually used
another value".

And not some "I read one value, and it was a mix of two other values".
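
As a rough userspace illustration of the "tested one value, used another"
problem described above, the sketch below takes a single snapshot of a
shared pointer and then uses only that snapshot. MY_READ_ONCE(), struct
item, shared_item and get_value() are all made-up names for this example,
and the macro is a simplified stand-in that relies on GNU C __typeof__; it
is not the kernel's actual definition.

```c
#include <stddef.h>

/* Hypothetical stand-in for READ_ONCE(): forces exactly one load via volatile. */
#define MY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct item {
	int value;
};

/* Hypothetical shared pointer that another thread may update at any time. */
struct item *shared_item;

/* Returns the published item's value, or -1 if nothing is published. */
int get_value(void)
{
	/*
	 * One load into a local. The NULL test and the dereference below
	 * both use this same snapshot, so the compiler cannot reload
	 * shared_item between the check and the use.
	 */
	struct item *p = MY_READ_ONCE(shared_item);

	if (p == NULL)
		return -1;
	return p->value;
}
```

With a plain "if (shared_item) return shared_item->value;" the compiler
would be free to load shared_item twice, which is exactly the case the
code cares about avoiding.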

But in order to get those semantics, the READ_ONCE() and WRITE_ONCE()
macros don't do just the 'volatile' (to get the "no reloads"
guarantee), but they also do that "simple types" check.

So READ_ONCE/WRITE_ONCE has never really been "extended for other
purposes". The purpose has always been the same: one single consistent
value.

What did happen is that our *original* name for this was not "read vs
write", but just "access".

So instead of "READ_ONCE(x)" you'd do "ACCESS_ONCE(x)", and instead of
"WRITE_ONCE(x,y)" you'd do "ACCESS_ONCE(x) = y".
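
To make the two spellings concrete, here is a minimal userspace sketch.
The MY_* macros are simplified stand-ins written for this example (they
assume GNU C __typeof__ and leave out the type checking discussed
elsewhere in this mail), and flag plus the four helpers are made-up names.

```c
/* Old style: one macro yielding a volatile lvalue, used for loads and stores. */
#define MY_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* New style: separate macros for the load and for the store. */
#define MY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
#define MY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical shared variable. */
static int flag;

/* Old spelling: "ACCESS_ONCE(x) = y" and "ACCESS_ONCE(x)". */
void set_flag_old(int v)
{
	MY_ACCESS_ONCE(flag) = v;
}

int get_flag_old(void)
{
	return MY_ACCESS_ONCE(flag);
}

/* New spelling: "WRITE_ONCE(x, y)" and "READ_ONCE(x)". */
void set_flag_new(int v)
{
	MY_WRITE_ONCE(flag, v);
}

int get_flag_new(void)
{
	return MY_READ_ONCE(flag);
}
```

For a simple type like this int, both spellings come down to the same
single volatile access.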

And, to make matters more interesting, we had code that did that on
things that were *not* simple values. IOW, we'd have things like
ACCESS_ONCE() on things that literally *couldn't* be accessed as one
single value.

The most notable was accessing page table entries, which on multiple
architectures (including plain old 32-bit x86) ended up being two
words.

So the extension that *did* happen is that READ_ONCE and WRITE_ONCE
actually verify that the type is simple, and that you can't do a
64-bit READ_ONCE on a 32-bit architecture. Because then while you
might guarantee that the value isn't reloaded multiple times, you
cannot guarantee that you actually get a value that is consistent with
a WRITE_ONCE (because the reads and writes are both two operations).
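
The kind of compile-time check described here can be sketched roughly as
follows. This is an illustration under simplifying assumptions, not the
kernel's real macro: MY_IS_SINGLE_WORD(), MY_READ_ONCE(), struct two_words
and the helper functions are made-up names, "simple" is approximated as
"no larger than a pointer-sized word", and the kernel's actual check
differs in detail.

```c
#include <stdint.h>

/* Assumption for this sketch: "simple" means it fits in one machine word. */
#define MY_IS_SINGLE_WORD(x) (sizeof(x) <= sizeof(uintptr_t))

/*
 * Hypothetical READ_ONCE() stand-in with a size check. The array has
 * size -1 when the operand is too wide, which is a compile error.
 */
#define MY_READ_ONCE(x)						\
	((void)sizeof(char[1 - 2 * !MY_IS_SINGLE_WORD(x)]),	\
	 *(const volatile __typeof__(x) *)&(x))

/* A two-word value, like the two-word page table entries mentioned above. */
struct two_words {
	uintptr_t lo, hi;
};

static uintptr_t counter;
static struct two_words wide;

/* Compiles: counter is a single word. */
uintptr_t read_counter(void)
{
	return MY_READ_ONCE(counter);
}

/* Returns 0: wide spans two words, so MY_READ_ONCE(wide) would not compile. */
int wide_is_single_word(void)
{
	return MY_IS_SINGLE_WORD(wide);
}
```

The old ACCESS_ONCE() style had no such check, so the equivalent of
MY_READ_ONCE(wide) compiled silently and became two separate loads.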

Now, we've gotten rid of the whole ACCESS_ONCE() thing, and so some of
that history is no longer visible (although you can still see that
pattern in the rseq self-tests).

So yes, READ_ONCE/WRITE_ONCE do control "tearing", but realistically,
it was always only about the "complex values" kind of tearing that the
old ACCESS_ONCE() model silently and incorrectly allowed.

              Linus
