lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 7 Jun 2021 11:08:26 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Will Deacon <will@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Alan Stern <stern@...land.harvard.edu>,
        Segher Boessenkool <segher@...nel.crashing.org>,
        Andrea Parri <parri.andrea@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Nick Piggin <npiggin@...il.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Akira Yokosawa <akiyks@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-toolchains@...r.kernel.org,
        linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [RFC] LKMM: Add volatile_if()

On Mon, Jun 07, 2021 at 05:02:53PM +0100, Will Deacon wrote:
> Hi Paul,
> 
> On Mon, Jun 07, 2021 at 08:25:33AM -0700, Paul E. McKenney wrote:
> > On Mon, Jun 07, 2021 at 12:52:35PM +0100, Will Deacon wrote:
> > > It's the conditional instructions that are more fun. For example, the CSEL
> > > instruction:
> > > 
> > > 	CSEL	X0, X1, X2, <cond>
> > > 
> > > basically says:
> > > 
> > > 	if (cond)
> > > 		X0 = X1;
> > > 	else
> > > 		X0 = X2;
> > > 
> > > these are just register-register operations, but the idea is that the CPU
> > > can predict that "branching event" inside the CSEL instruction and
> > > speculatively rename X0 while waiting for the condition to resolve.
> > > 
> > > So then you can add loads and stores to the mix along the lines of:
> > > 
> > > 	LDR	X0, [X1]		// X0 = *X1
> > > 	CMP	X0, X2
> > > 	CSEL	X3, X4, X5, EQ		// X3 = (X0 == X2) ? X4 : X5
> > > 	STR	X3, [X6]		// MUST BE ORDERED AFTER THE LOAD
> > > 	STR	X7, [X8]		// Can be reordered
> > > 
> > > (assuming X1, X6, X8 all point to different locations in memory)
> > > 
> > > So now we have a dependency from the load to the first store, but the
> > > interesting part is that the last store is _not_ ordered wrt either of the
> > > other two memory accesses, whereas it would be if we used a conditional
> > > branch instead of the CSEL. Make sense?
> > 
> > And if I remember correctly, this is why LKMM orders loads in the
> > "if" condition only with stores in the "then" and "else" clauses,
> > not with stores after the end of the "if" statement.  Or is there
> > some case that I am missing?
> 
> It's not clear to me that such a restriction prevents the compiler from
> using any of the arm64 conditional instructions in place of the conditional
> branch in such a way that you end up with an "independent" store in the
> assembly output constructed from two stores on the "then" and "else" paths
> which the compiler determined where the same.
> 
> > > Now, obviously the compiler is blissfully unaware that conditional
> > > data processing instructions can give rise to dependencies than
> > > conditional branches, so the question really is how much do we need to
> > > care in the kernel?
> > > 
> > > My preference is to use load-acquire instead of control dependencies so
> > > that we don't have to worry about this, or any future relaxations to the
> > > CPU architecture, at all.
> > 
> > From what I can see, ARMv8 has DMB(LD) and DMB(ST).  Does it have
> > something like a DMB(LD,ST) that would act something like powerpc lwsync?
> > 
> > Or are you proposing rewriting the "if" conditions to upgrade
> > READ_ONCE() to smp_load_acquire()?  Or something else?
> > 
> > Just trying to find out exactly what you are proposing.  ;-)
> 
> Some options are:
> 
>  (1) Do nothing until something actually goes wrong (and hope we spot/debug it)
> 
>  (2) Have volatile_if force a conditional branch, assuming that it solves
>      the problem and doesn't hurt codegen (I still haven't convinced myself
>      for either case)
> 
>  (3) Upgrade READ_ONCE() to RCpc acquire, relaxed atomic RMWs to RCsc
>      acquire on arm64
> 
>  (4) Introduce e.g. READ_ONCE_CTRL(), atomic_add_return_ctrl() etc
>      specifically for control dependencies and upgrade only those for
>      arm64
> 
>  (5) Work to get toolchain support for dependency ordering and use that
> 
> I'm suggesting (3) or (4) because, honestly, it feels like we're being
> squeezed from both sides with both the compiler and the hardware prepared
> to break control dependencies.

I will toss out this as well:

  (6) Create a volatile_if() that does not support an "else" clause,
      thus covering all current use cases and avoiding some of the
      same-store issues.  Which in the end might or might not help,
      but perhaps worth looking into.

							Thanx, Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ