linux-kernel - Unlock-lock questions and the Linux Kernel Memory Model

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.44L0.1711271553490.1424-100000@iolanthe.rowland.org>
Date:   Mon, 27 Nov 2017 16:16:47 -0500 (EST)
From:   Alan Stern <stern@...land.harvard.edu>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Andrea Parri <parri.andrea@...il.com>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Jade Alglave <j.alglave@....ac.uk>,
        Boqun Feng <boqun.feng@...il.com>,
        Nicholas Piggin <npiggin@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will.deacon@....com>,
        David Howells <dhowells@...hat.com>,
        Daniel Lustig <dlustig@...dia.com>,
        Palmer Dabbelt <palmer@...belt.com>
cc:     Kernel development list <linux-kernel@...r.kernel.org>
Subject: Unlock-lock questions and the Linux Kernel Memory Model

This is essentially a repeat of an email I sent out before the
Thanksgiving holiday, the assumption being that lack of any responses
was caused by the holiday break.  (And this time the message is CC'ed
to LKML, so there will be a public record of it.)

A few people have said they believe the Linux Kernel Memory Model
should make unlock followed by lock (of the same variable) act as a
write memory barrier.  In other words, they want the memory model to
forbid the following litmus test:

C unlock-lock-write-ordering-1

{}

P0(int *x, int *y, spinlock_t *s)
{
	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
	spin_lock(s);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	r1 = READ_ONCE(*y);
	smp_rmb();
	r2 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 1:r2=0)

Judging from this, it seems likely that they would want the memory
model to forbid the next two litmus tests as well (since there's never
any guarantee that some other CPU won't interpose a critical section
between a given unlock and a following lock):

C unlock-lock-write-ordering-2

{}

P0(int *x, spinlock_t *s)
{
	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
}

P1(int *x, int *y, spinlock_t *s)
{
	spin_lock(s);
	r1 = READ_ONCE(*x);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
}

P2(int *x, int *y)
{
	r2 = READ_ONCE(*y);
	smp_rmb();
	r3 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 2:r2=1 /\ 2:r3=0)

C unlock-lock-write-ordering-3

{}

P0(int *x, int *y, int *z, spinlock_t *s)
{
	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
	spin_lock(s);
	r1 = READ_ONCE(*z);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
}

P1(int *x, int *z, spinlock_t *s)
{
	spin_lock(s);
	r2 = READ_ONCE(*x);
	WRITE_ONCE(*z, 1);
	spin_unlock(s);
}

P2(int *x, int *y)
{
	r3 = READ_ONCE(*y);
	smp_rmb();
	r4 = READ_ONCE(*x);
}

exists (0:r1=1 /\ 1:r2=1 /\ 2:r3=1 /\ 2:r4=0)

The general justification is that all the architectures currently 
supported by the kernel do forbid these tests, and there is code in the 
kernel that depends on this behavior.  Thus the model should forbid 
them.  (Currently the model allows them, on the principle that ordering 
induced by a lock is visible only to CPUs that take the lock.)

On the other hand, whether RISC-V would forbid these tests is not
clear.  Basically, it comes down to using an RCsc versus an RCpc
approach for the locking primitives.

Given that spin_lock() and spin_unlock() derive many of their
properties from acting as an acquire and a release respectively, we can
ask if the memory model should forbid the analogous release-acquire
litmus test:

C rel-acq-write-ordering-3

{}

P0(int *x, int *s, int *y)
{
	WRITE_ONCE(*x, 1);
	smp_store_release(s, 1);
	r1 = smp_load_acquire(s);
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	r2 = READ_ONCE(*y);
	smp_rmb();
	r3 = READ_ONCE(*x);
}

exists (1:r2=1 /\ 1:r3=0)

For that matter, what if we take the initial write to be the 
store-release itself rather than something coming before it?

C rel-acq-write-ordering-1

{}

P0(int *s, int *y)
{
	smp_store_release(s, 1);
	r1 = smp_load_acquire(s);
	WRITE_ONCE(*y, 1);
}

P1(int *s, int *y)
{
	r2 = READ_ONCE(*y);
	smp_rmb();
	r3 = READ_ONCE(*s);
}

exists (1:r2=1 /\ 1:r3=0)

And going to extremes, what if the load-acquire reads from a different 
variable, not the one written by the store-release?  This would be 
like unlocking s and then locking t:

C rel-acq-write-ordering-2

{}

P0(int *s, int *t, int *y)
{
	smp_store_release(s, 1);
	r1 = smp_load_acquire(t);
	WRITE_ONCE(*y, 1);
}

P1(int *s, int *y)
{
	r2 = READ_ONCE(*y);
	smp_rmb();
	r3 = READ_ONCE(*s);
}

exists (1:r2=1 /\ 1:r3=0)

The architectures currently supported by the kernel all forbid this 
last test, but whether we want the model to forbid it is a lot more 
questionable.

I (and others!) would like to know people's opinions on these matters.

Alan Stern