lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 28 Nov 2017 10:58:50 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Andrea Parri <parri.andrea@...il.com>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Jade Alglave <j.alglave@....ac.uk>,
        Boqun Feng <boqun.feng@...il.com>,
        Nicholas Piggin <npiggin@...il.com>,
        Will Deacon <will.deacon@....com>,
        David Howells <dhowells@...hat.com>,
        Daniel Lustig <dlustig@...dia.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Unlock-lock questions and the Linux Kernel Memory Model

On Mon, Nov 27, 2017 at 04:16:47PM -0500, Alan Stern wrote:
> This is essentially a repeat of an email I sent out before the
> Thanksgiving holiday, the assumption being that lack of any responses
> was caused by the holiday break.  (And this time the message is CC'ed
> to LKML, so there will be a public record of it.)

Right, thanks! No turkey days on this side of the pond, but it still got
lost in the noise, sorry about that.

> A few people have said they believe the Linux Kernel Memory Model
> should make unlock followed by lock (of the same variable) act as a
> write memory barrier.  In other words, they want the memory model to
> forbid the following litmus test:
> 
> 
> C unlock-lock-write-ordering-1
> 
> {}
> 
> P0(int *x, int *y, spinlock_t *s)
> {
> 	spin_lock(s);
> 	WRITE_ONCE(*x, 1);
> 	spin_unlock(s);
> 	spin_lock(s);
> 	WRITE_ONCE(*y, 1);
> 	spin_unlock(s);
> }
> 
> P1(int *x, int *y)
> {
> 	r1 = READ_ONCE(*y);
> 	smp_rmb();
> 	r2 = READ_ONCE(*x);
> }
> 
> exists (1:r1=1 /\ 1:r2=0)

ACK, that is the pattern I was using.

> Judging from this, it seems likely that they would want the memory
> model to forbid the next two litmus tests as well (since there's never
> any guarantee that some other CPU won't interpose a critical section
> between a given unlock and a following lock):
> 
> 
> C unlock-lock-write-ordering-2
> 
> {}
> 
> P0(int *x, spinlock_t *s)
> {
> 	spin_lock(s);
> 	WRITE_ONCE(*x, 1);
> 	spin_unlock(s);
> }
> 
> P1(int *x, int *y, spinlock_t *s)
> {
> 	spin_lock(s);
> 	r1 = READ_ONCE(*x);
> 	WRITE_ONCE(*y, 1);
> 	spin_unlock(s);
> }
> 
> P2(int *x, int *y)
> {
> 	r2 = READ_ONCE(*y);
> 	smp_rmb();
> 	r3 = READ_ONCE(*x);
> }
> 
> exists (1:r1=1 /\ 2:r2=1 /\ 2:r3=0)

Agreed, I would indeed also expect that to 'work'.

> C unlock-lock-write-ordering-3
> 
> {}
> 
> P0(int *x, int *y, int *z, spinlock_t *s)
> {
> 	spin_lock(s);
> 	WRITE_ONCE(*x, 1);
> 	spin_unlock(s);
> 	spin_lock(s);
> 	r1 = READ_ONCE(*z);
> 	WRITE_ONCE(*y, 1);
> 	spin_unlock(s);
> }
> 
> P1(int *x, int *z, spinlock_t *s)
> {
> 	spin_lock(s);
> 	r2 = READ_ONCE(*x);
> 	WRITE_ONCE(*z, 1);
> 	spin_unlock(s);
> }
> 
> P2(int *x, int *y)
> {
> 	r3 = READ_ONCE(*y);
> 	smp_rmb();
> 	r4 = READ_ONCE(*x);
> }
> 
> exists (0:r1=1 /\ 1:r2=1 /\ 2:r3=1 /\ 2:r4=0)

And this is the same except with one more link in the chain, right? I
would've put it in P3 or somesuch but no matter.

> The general justification is that all the architectures currently
> supported by the kernel do forbid these tests, and there is code in
> the kernel that depends on this behavior.  Thus the model should
> forbid them.  (Currently the model allows them, on the principle that
> ordering induced by a lock is visible only to CPUs that take the
> lock.)
> 
> On the other hand, whether RISC-V would forbid these tests is not
> clear.  Basically, it comes down to using an RCsc versus an RCpc
> approach for the locking primitives.

Not entirely, as Power is currently RCpc but still forbids these.

> Given that spin_lock() and spin_unlock() derive many of their
> properties from acting as an acquire and a release respectively, we can
> ask if the memory model should forbid the analogous release-acquire
> litmus test:

> C rel-acq-write-ordering-3
> 
> {}
> 
> P0(int *x, int *s, int *y)
> {
> 	WRITE_ONCE(*x, 1);
> 	smp_store_release(s, 1);
> 	r1 = smp_load_acquire(s);
> 	WRITE_ONCE(*y, 1);
> }
> 
> P1(int *x, int *y)
> {
> 	r2 = READ_ONCE(*y);
> 	smp_rmb();
> 	r3 = READ_ONCE(*x);
> }
> 
> exists (1:r2=1 /\ 1:r3=0)

Ideally yes, consistency is good.

> For that matter, what if we take the initial write to be the 
> store-release itself rather than something coming before it?
> 
> 
> C rel-acq-write-ordering-1
> 
> {}
> 
> P0(int *s, int *y)
> {
> 	smp_store_release(s, 1);
> 	r1 = smp_load_acquire(s);
> 	WRITE_ONCE(*y, 1);
> }
> 
> P1(int *s, int *y)
> {
> 	r2 = READ_ONCE(*y);
> 	smp_rmb();
> 	r3 = READ_ONCE(*s);
> }
> 
> exists (1:r2=1 /\ 1:r3=0)

I expect it would take extra work to make this one fail while the
previous one works, no?

That is, given the things so far, it would only be consistent for this
to also work.

> And going to extremes, what if the load-acquire reads from a different 
> variable, not the one written by the store-release?  This would be 
> like unlocking s and then locking t:
> 
> 
> C rel-acq-write-ordering-2
> 
> {}
> 
> P0(int *s, int *t, int *y)
> {
> 	smp_store_release(s, 1);
> 	r1 = smp_load_acquire(t);
> 	WRITE_ONCE(*y, 1);
> }
> 
> P1(int *s, int *y)
> {
> 	r2 = READ_ONCE(*y);
> 	smp_rmb();
> 	r3 = READ_ONCE(*s);
> }
> 
> exists (1:r2=1 /\ 1:r3=0)
> 
> 
> The architectures currently supported by the kernel all forbid this 
> last test, but whether we want the model to forbid it is a lot more 
> questionable.
> 
> I (and others!) would like to know people's opinions on these matters.

This one is interesting... Yes it will work on current hardware, but I'm
not sure I'd expect this. I'll ponder this a wee bit more.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ