linux-kernel - Re: [RFC][PATCH v3]: documentation,atomic: Add new documents

Message-ID: <20170802094531.GA15748@arm.com>
Date:   Wed, 2 Aug 2017 10:45:32 +0100
From:   Will Deacon <will.deacon@....com>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Boqun Feng <boqun.feng@...il.com>,
        linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Randy Dunlap <rdunlap@...radead.org>
Subject: Re: [RFC][PATCH v3]: documentation,atomic: Add new documents

Hi Paul,

On Tue, Aug 01, 2017 at 09:14:12AM -0700, Paul E. McKenney wrote:
> On Tue, Aug 01, 2017 at 01:17:13PM +0100, Will Deacon wrote:
> > On Tue, Aug 01, 2017 at 01:47:44PM +0200, Peter Zijlstra wrote:
> > > On Tue, Aug 01, 2017 at 11:19:00AM +0100, Will Deacon wrote:
> > > > On Tue, Aug 01, 2017 at 11:01:21AM +0200, Peter Zijlstra wrote:
> > > > > On Mon, Jul 31, 2017 at 10:43:45AM -0700, Paul E. McKenney wrote:
> > > > > > So if I have something like this, the assertion really can trigger?
> > > > > > 
> > > > > > 	WRITE_ONCE(x, 1);		atomic_inc(&y);
> > > > > > 	r0 = xchg_release(&y, 5);	smp_mb__after_atomic();
> > > > > > 					r1 = READ_ONCE(x);
> > > > > > 
> > > > > > 
> > > > > > 	WARN_ON(r0 == 0 && r1 == 0);
> > > > > > 
> > > > > > I must confess that I am not seeing why we would want to allow this
> > > > > > outcome.
> > > > > 
> > > > > No you are indeed quite right. I just wasn't creative enough. Thanks for
> > > > > the inspiration.
> > > > 
> > > > Just to close this out, we agree that an smp_rmb() instead of
> > > > smp_mb__after_atomic() would *not* forbid this outcome, right?
> > > 
> > > So that really hurts my brain. Per the normal rules that smp_rmb() would
> > > order the read of @x against the last ll of @y and per ll/sc ordering
> > > you then still don't get to make the WARN happen.
> > > 
> > > On IRC you explained that your 8.1 LSE instructions are not in fact
> > > ordered by a smp_rmb, only by smp_wmb, which is 'surprising' since you
> > > really need to load the old value to compute the new value.
> > 
> > To be clear, it's only the ST* variants of the LSE instructions that are
> > treated as a write for the purposes of memory ordering, so these are the
> > non-*_return variants. It's not unlikely that other architectures will
> > exhibit the same behaviour (e.g. Power, RISC-V), because the CPU can
> > treat non-return atomics as "fire-and-forget" and have them handled
> > elsewhere in the memory subsystem, causing them to be treated similarly
> > to posted writes.
> > 
> > For the code snippet above, the second thread has no idea about the value
> > of y and so smp_rmb() is the wrong thing to be using imo. It really cares
> > about ordering the store to y before the read of x, so needs a full mb (i.e.
> > the test is more like 'R' than 'MP').
> > 
> > Also, wouldn't this problem also arise if your atomics were built using a
> > spinlock where unlock had release semantics?
> 
> The current Linux kernel memory model forbids this outcome with smp_rmb(),
> though I did have to work around the current lack of atomic_inc() using
> xchg_relaxed(), so please review my litmus tests carefully.

It's worth noting that we don't have the problem with any value-returning
atomics, so all flavours of xchg in this test would be forbidden on arm64
too.
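
For anyone following along, the difference between the two fences in the
test quoted above can be sketched with a small brute-force model in C. This
is only an illustration under stated assumptions: the helper name
outcome_reachable() and its p1_read_first flag are hypothetical, every
access is treated as one indivisible step, and letting the read of x run
ahead of the atomic_inc() is a crude stand-in for a non-returning atomic
being handled like a posted write. It is not a replacement for herd or
ppcmem.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the two-thread test:
 *
 *   P0: WRITE_ONCE(x, 1);            P1: atomic_inc(&y);
 *       r0 = xchg_release(&y, 5);        <fence>
 *                                        r1 = READ_ONCE(x);
 *
 * p1_read_first == false: P1 runs in program order (what a full
 *                         barrier would guarantee).
 * p1_read_first == true:  assumed reordering, P1's read of x is
 *                         performed before its increment of y.
 *
 * Returns true if some interleaving ends with r0 == 0 && r1 == 0.
 */
static bool outcome_reachable(bool p1_read_first)
{
	/* Bit i of mask set means step i of the interleaving belongs to P0. */
	for (unsigned int mask = 0; mask < 16; mask++) {
		int bits = 0;
		for (int i = 0; i < 4; i++)
			bits += (mask >> i) & 1;
		if (bits != 2)
			continue;	/* each thread takes exactly two steps */

		int x = 0, y = 0, r0 = -1, r1 = -1;
		int p0 = 0, p1 = 0;

		for (int i = 0; i < 4; i++) {
			if ((mask >> i) & 1) {
				if (p0++ == 0) {
					x = 1;		/* WRITE_ONCE(x, 1) */
				} else {
					r0 = y;		/* xchg_release(&y, 5) */
					y = 5;
				}
			} else {
				bool first = (p1++ == 0);
				if (first == p1_read_first)
					r1 = x;		/* READ_ONCE(x) */
				else
					y++;		/* atomic_inc(&y) */
			}
		}

		/* Sanity check: both loads really happened. */
		assert(r0 != -1 && r1 != -1);

		if (r0 == 0 && r1 == 0)
			return true;
	}
	return false;
}
```

Calling outcome_reachable(false) explores the six program-order
interleavings, and outcome_reachable(true) explores the same six with P1's
two steps swapped. Only the swapped case can reach r0 == 0 && r1 == 0,
which is the store-to-load ordering that the test depends on.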

> 	C C-WillDeacon-MP+o-r+ai-rmb-o.litmus
> 
> 	(*
> 	 * Expected result: Never.
> 	 *
> 	 * Desired litmus test, with atomic_inc() emulated by xchg_relaxed():
> 	 *
> 	 *     WRITE_ONCE(x, 1);               atomic_inc(&y);
> 	 *     r0 = xchg_release(&y, 5);       smp_rmb();
> 	 *                                     r1 = READ_ONCE(x);
> 	 *
> 	 *
> 	 *     WARN_ON(r0 == 0 && r1 == 0);
> 	 *)
> 
> 	{
> 	}
> 
> 	P0(int *x, int *y)
> 	{
> 		WRITE_ONCE(*x, 1);
> 		r0 = xchg_release(y, 5);
> 	}
> 
> 	P1(int *x, int *y)
> 	{
> 		r2 = xchg_relaxed(y, 1);
> 		smp_rmb();
> 		r1 = READ_ONCE(*x);
> 	}
> 
> 	exists
> 	(0:r0=0 /\ 1:r1=0)
> 
> Here is what herd thinks:
> 
> 	$ herd7 -bell strong-kernel.bell -cat weak-kernel.cat -macros linux.def ../litmus/manual/kernel/C-WillDeacon-MP+o-r+ai-rmb-o.litmus
> 	Test C-WillDeacon-MP+o-r+ai-rmb-o Allowed
> 	States 3
> 	0:r0=0; 1:r1=1;
> 	0:r0=1; 1:r1=0;
> 	0:r0=1; 1:r1=1;
> 	No
> 	Witnesses
> 	Positive: 0 Negative: 3
> 	Condition exists (0:r0=0 /\ 1:r1=0)
> 	Observation C-WillDeacon-MP+o-r+ai-rmb-o Never 0 3
> 	Hash=0c3e25a94b38708a2c5ea11ff52c8077
> 
> I get the same answer from strong-kernel.cat (which is our best-guess
> envelope over hardware guarantees), weak-kernel.cat (which is simplified
> based on what people actually use), and proposal.cat (which is a candidate
> model with further simplifications).
> 
> I converted this (possibly incorrectly) to PowerPC assembly:
> 
> 	PPC w-RMWl-r+w-RMWl-r.litmus
> 	""
> 	(*
> 	 * Does 3.0 Linux-kernel Power atomic_add_return() provide local 
> 	 * barrier that orders prior stores against subsequent loads?
> 	 * Use the atomic_add_return() in both threads, but to different variables.
> 	 * And use the trailing-lwsync variant of atomic_add_return().
> 	 *)
> 	(* 24-Aug-2011: ppcmem says "Sometimes" *)
> 	{
> 	0:r1=1; 0:r2=x; 0:r3=5; 0:r4=y;   0:r10=0 ; 0:r11=0;
> 	1:r1=1; 1:r2=x; 1:r3=5; 1:r4=y;   1:r10=0 ; 1:r11=0;
> 	}
> 	 P0                | P1                ;
> 	 stw r1,0(r2)      | lwarx  r11,r10,r4 ;
> 	 lwsync            | stwcx. r1,r10,r4  ;
> 	 lwarx  r11,r10,r4 | bne Fail1         ;
> 	 stwcx. r3,r10,r4  | lwsync            ;
> 	 bne Fail0         | lwz r3,0(r2)      ;
> 	 li r3,42          | Fail1:            ;
> 	 Fail0:            |                   ;
> 
> 
> 	exists
> 	(0:r11=0 /\ 0:r3=42 /\ 1:r3=0)
> 
> And ppcmem agrees with the linux-kernel memory model:
> 
> 	[ . . . ]
> 
> 	Found     82 : Prune count= 13946  seen_succs=  7453   7454 states 
> 	Found     83 : Prune count= 13997  seen_succs=  7490   7491 states 
> 	Found     84 : Prune count= 14007  seen_succs=  7506   7507 states 
> 	Found     85 : Prune count= 17229  seen_succs=  8889   8890 states 
> 	Found     86 : Prune count= 17235  seen_succs=  8897   8898 states 
> 	Test w-RMWl-r+w-RMWl-r Allowed
> 	States 9
> 	0:r3=5; 0:r11=0; 1:r3=0;
> 	0:r3=5; 0:r11=0; 1:r3=1;
> 	0:r3=5; 0:r11=0; 1:r3=5;
> 	0:r3=5; 0:r11=1; 1:r3=0;
> 	0:r3=5; 0:r11=1; 1:r3=1;
> 	0:r3=42; 0:r11=0; 1:r3=1;
> 	0:r3=42; 0:r11=0; 1:r3=5;
> 	0:r3=42; 0:r11=1; 1:r3=0;
> 	0:r3=42; 0:r11=1; 1:r3=1;
> 	No (allowed not found)
> 	Condition exists (0:r11=0 /\ 0:r3=42 /\ 1:r3=0)
> 	Hash=58fb07516ac5697580c33e06a354f667
> 	Observation w-RMWl-r+w-RMWl-r Never 0 9 
> 
> So if ARM really needs the litmus test with smp_rmb() to be allowed,
> we need to adjust the Linux-kernel memory model appropriately.  Which
> means that one of us needs to reach out to the usual suspects.  Would
> you like to do that, or would you like me to?

If you don't mind doing it, then that would be great, thanks. Do shout if
you need me to help with anything, though!

Will