lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 7 Oct 2020 10:11:07 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Florian Weimer <fweimer@...hat.com>,
        linux-toolchains@...r.kernel.org, Will Deacon <will@...nel.org>,
        linux-kernel@...r.kernel.org, stern@...land.harvard.edu,
        parri.andrea@...il.com, boqun.feng@...il.com, npiggin@...il.com,
        dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
        akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
        torvalds@...ux-foundation.org
Subject: Re: Control Dependencies vs C Compilers

On Wed, Oct 07, 2020 at 01:50:54PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 07, 2020 at 12:20:41PM +0200, Florian Weimer wrote:
> > * Peter Zijlstra:

[ . . . ]

> > >> I think in GCC, they are called __atomic_load_n(foo, __ATOMIC_RELAXED)
> > >> and __atomic_store_n(foo, __ATOMIC_RELAXED).  GCC can't optimize relaxed
> > >> MO loads and stores because the C memory model is defective and does not
> > >> actually guarantee the absence of out-of-thin-air values (a property it
> > >> was supposed to have).
> > >
> > > AFAIK people want to get that flaw in the C memory model fixed (which to
> > > me seemd like a very good idea).
> > 
> > It's been a long time since people realized that this problem exists,
> > with several standard releases since then.
> 
> I've been given to believe it is a hard problem. Personally I hold the
> opinion that prohibiting store speculation (of all kinds) is both
> necesary and sufficient to avoid OOTA. But I have 0 proof for that.

There are proofs for some definitions of store speculation, for example,
as proposed by Demsky and Boehm [1] and as prototyped by Demsky's student,
Peizhao Ou [2].  But these require marking all accesses and end up being
optimized variants of acquire load and release store.  One optimization
is that if you have a bunch of loads followed by a bunch of stores,
the compiler can emit a single memory-barrier instruction between the
last load and the first store.

I am not a fan of this approach.

Challenges include:

o	Unmarked accesses.  Compilers are quite aggressive about
	moving normal code.

o	Separately compiled code.  For example, does the compiler have
	unfortunatel optimization opportunities when "volatile if" 
	appears in one translation unit and the dependent stores in
	some other translation unit?

o	LTO, as has already been mentioned in this thread.

Probably other issues as well, but a starting point.

							Thanx, Paul

[1]	https://dl.acm.org/doi/10.1145/2618128.2618134
	"Outlawing ghosts: avoiding out-of-thin-air results"
	Hans-J. Boehm and Brian Demsky.

[2]	https://escholarship.org/uc/item/2vm546k1
	"An Initial Study of Two Approaches to Eliminating Out-of-Thin-Air
	Results" Peizhao Ou.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ