[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3dfe7daed3c44f46a6989b6513ad7bb0@AcuMS.aculab.com>
Date: Tue, 6 Oct 2020 12:37:06 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Peter Zijlstra' <peterz@...radead.org>,
"linux-toolchains@...r.kernel.org" <linux-toolchains@...r.kernel.org>,
"Will Deacon" <will@...nel.org>, Paul McKenney <paulmck@...nel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stern@...land.harvard.edu" <stern@...land.harvard.edu>,
"parri.andrea@...il.com" <parri.andrea@...il.com>,
"boqun.feng@...il.com" <boqun.feng@...il.com>,
"npiggin@...il.com" <npiggin@...il.com>,
"dhowells@...hat.com" <dhowells@...hat.com>,
"j.alglave@....ac.uk" <j.alglave@....ac.uk>,
"luc.maranget@...ia.fr" <luc.maranget@...ia.fr>,
"akiyks@...il.com" <akiyks@...il.com>,
"dlustig@...dia.com" <dlustig@...dia.com>,
"joel@...lfernandes.org" <joel@...lfernandes.org>,
"torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>
Subject: RE: Control Dependencies vs C Compilers
From: Peter Zijlstra
> Sent: 06 October 2020 12:47
> Hi,
>
> Let's give this linux-toolchains thing a test-run...
>
> As some of you might know, there's a bit of a discrepancy between what
> compiler and kernel people consider 'valid' use of the compiler :-)
>
> One area where this shows up is in implicit (memory) ordering provided
> by the hardware, which we kernel people would like to use to avoid
> explicit fences (expensive) but which the compiler is unaware of and
> could ruin (bad).
...
>
> In short, the control dependency relies on the hardware never
> speculating stores (instant OOTA) to provide a LOAD->STORE ordering.
> That is, a LOAD must be completed to resolve a conditional branch, the
> STORE is after the branch and cannot be made visible until the branch is
> determined (which implies the load is complete).
>
> However, our 'dear' C language has no clue of any of this.
>
> So given code like:
>
> x = *foo;
> if (x > 42)
> *bar = 1;
>
> Which, if literally translated into assembly, would provide a
> LOAD->STORE order between foo and bar, could, in the hands of an
> evil^Woptimizing compiler, become:
>
> x = *foo;
> *bar = 1;
>
> because it knows, through value tracking, that the condition must be
> true.
>
> Our Documentation/memory-barriers.txt has a Control Dependencies section
> (which I shall not replicate here for brevity) which lists a number of
> caveats. But in general the work-around we use is:
>
> x = READ_ONCE(*foo);
> if (x > 42)
> WRITE_ONCE(*bar, 1);
An alternative is to 'persuade' the compiler that
any 'tracked' value for a local variable is invalid.
Rather like the way that barrier() 'invalidates' memory.
So you generate:
x = *foo
asm ("" : "+r" (x));
if (x > 42)
*bar = 1;
Since the "+r" constraint indicates that the value of 'x'
might have changed it can't optimise based on any
presumed old value.
(Unless it looks inside the asm opcodes...)
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Powered by blists - more mailing lists