Message-ID: <20201007171107.GO29330@paulmck-ThinkPad-P72>
Date: Wed, 7 Oct 2020 10:11:07 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Florian Weimer <fweimer@...hat.com>,
linux-toolchains@...r.kernel.org, Will Deacon <will@...nel.org>,
linux-kernel@...r.kernel.org, stern@...land.harvard.edu,
parri.andrea@...il.com, boqun.feng@...il.com, npiggin@...il.com,
dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
torvalds@...ux-foundation.org
Subject: Re: Control Dependencies vs C Compilers
On Wed, Oct 07, 2020 at 01:50:54PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 07, 2020 at 12:20:41PM +0200, Florian Weimer wrote:
> > * Peter Zijlstra:
[ . . . ]
> > >> I think in GCC, they are called __atomic_load_n(foo, __ATOMIC_RELAXED)
> > >> and __atomic_store_n(foo, __ATOMIC_RELAXED). GCC can't optimize relaxed
> > >> MO loads and stores because the C memory model is defective and does not
> > >> actually guarantee the absence of out-of-thin-air values (a property it
> > >> was supposed to have).
> > >
> > > AFAIK people want to get that flaw in the C memory model fixed (which to
> > > me seemed like a very good idea).
> >
> > It's been a long time since people realized that this problem exists,
> > with several standard releases since then.
>
> I've been given to believe it is a hard problem. Personally I hold the
> opinion that prohibiting store speculation (of all kinds) is both
> necessary and sufficient to avoid OOTA. But I have 0 proof for that.
There are proofs for some definitions of store speculation, for example,
as proposed by Demsky and Boehm [1] and as prototyped by Demsky's student,
Peizhao Ou [2]. But these require marking all accesses and end up being
optimized variants of acquire load and release store. One optimization
is that if you have a bunch of loads followed by a bunch of stores,
the compiler can emit a single memory-barrier instruction between the
last load and the first store.
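
To make that optimization concrete, here is a rough sketch in C11 atomics.
It is not taken from either reference: the variable names and the function
name are hypothetical, and the use of a single acq_rel fence is my
assumption about how "one barrier between the last load and the first
store" would look at the source level.

```c
#include <stdatomic.h>

/* Hypothetical shared variables, not from the cited papers. */
static _Atomic int x, y;	/* read by this thread */
static _Atomic int a, b;	/* written by this thread */

/* A bunch of loads followed by a bunch of stores. */
int loads_then_stores(void)
{
	/* Relaxed loads impose no ordering by themselves. */
	int r1 = atomic_load_explicit(&x, memory_order_relaxed);
	int r2 = atomic_load_explicit(&y, memory_order_relaxed);

	/*
	 * One fence acts as an acquire fence for the loads above and
	 * as a release fence for the stores below, instead of making
	 * each load an acquire and each store a release.
	 */
	atomic_thread_fence(memory_order_acq_rel);

	/* Relaxed stores, ordered after the loads by the fence. */
	atomic_store_explicit(&a, r1, memory_order_relaxed);
	atomic_store_explicit(&b, r2, memory_order_relaxed);

	return r1 + r2;
}
```

On weakly ordered hardware the fence would typically turn into a single
barrier instruction, which is the saving described above.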
I am not a fan of this approach.
Challenges include:
o Unmarked accesses. Compilers are quite aggressive about
moving normal code.
o Separately compiled code. For example, does the compiler have
unfortunate optimization opportunities when "volatile if"
appears in one translation unit and the dependent stores in
some other translation unit?
o LTO, as has already been mentioned in this thread.
Probably other issues as well, but a starting point.
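
For reference, the pattern all of these challenges revolve around is a
load-to-store control dependency. A minimal sketch in plain C follows.
The READ_ONCE()/WRITE_ONCE() macros are approximated here with volatile
casts on int, and the variable and function names are hypothetical. The
C standard does not promise to preserve this ordering, which is the
problem under discussion.

```c
/* Kernel-style marked accesses, approximated for int only (assumption). */
#define READ_ONCE(v)		(*(const volatile int *)&(v))
#define WRITE_ONCE(v, n)	(*(volatile int *)&(v) = (n))

/* Hypothetical shared variables. */
static int flag, data;

int publish_if_flag(void)
{
	if (READ_ONCE(flag)) {		/* load feeding the branch */
		WRITE_ONCE(data, 1);	/* store that depends on the branch */
		return 1;
	}
	return 0;
}
```

If the load and the dependent store end up in different translation
units, or if LTO merges them back together, the compiler's view of the
branch changes, and so does what it is able to optimize.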
Thanx, Paul
[1] https://dl.acm.org/doi/10.1145/2618128.2618134
"Outlawing ghosts: avoiding out-of-thin-air results"
Hans-J. Boehm and Brian Demsky.
[2] https://escholarship.org/uc/item/2vm546k1
"An Initial Study of Two Approaches to Eliminating Out-of-Thin-Air
Results" Peizhao Ou.