Message-ID: <20190613180029.GO3436@hirez.programming.kicks-ass.net>
Date: Thu, 13 Jun 2019 20:00:29 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Alan Stern <stern@...land.harvard.edu>
Cc: David Howells <dhowells@...hat.com>, akiyks@...il.com,
andrea.parri@...rulasolutions.com, boqun.feng@...il.com,
dlustig@...dia.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
npiggin@...il.com, paulmck@...ux.ibm.com, will.deacon@....com,
paul.burton@...s.com, linux-kernel@...r.kernel.org,
torvalds@...ux-foundation.org
Subject: Re: [PATCH v2 0/4] atomic: Fixes to smp_mb__{before,after}_atomic()
 and mips.

On Thu, Jun 13, 2019 at 12:58:11PM -0400, Alan Stern wrote:
> On Thu, 13 Jun 2019, David Howells wrote:
>
> > Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > > Basically we fail for:
> > >
> > > *x = 1;
> > > atomic_inc(u);
> > > smp_mb__after_atomic();
> > > r0 = *y;
> > >
> > > Because, while the atomic_inc() implies memory order, it
> > > (surprisingly) does not provide a compiler barrier. This then allows
> > > the compiler to re-order like so:
> >
> > To quote memory-barriers.txt:
> >
> > (*) smp_mb__before_atomic();
> > (*) smp_mb__after_atomic();
> >
> > These are for use with atomic (such as add, subtract, increment and
> > decrement) functions that don't return a value, especially when used for
> > reference counting. These functions do not imply memory barriers.
> >
> > so it's entirely to be expected?
>
> The text is perhaps ambiguous. It means that the atomic functions
> which don't return values -- like atomic_inc() -- do not imply memory
> barriers. It doesn't mean that smp_mb__before_atomic() and
> smp_mb__after_atomic() do not imply memory barriers.
>
> The behavior Peter described is not to be expected. The expectation is
> that the smp_mb__after_atomic() in the example should force the "*x =
> 1" store to execute before the "r0 = *y" load. But on current x86 it
> doesn't force this, for the reason explained in the description.

Indeed, thanks Alan.
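
To spell out the reordering described in the quoted mail (a sketch; it
assumes the pre-fix x86 atomics, whose inline asm lacks a "memory"
clobber, and that smp_mb__after_atomic() expands to barrier() there):

	atomic_inc(u);			/* moved up: no "memory" clobber */
	*x = 1;				/* the store sank below the atomic */
	smp_mb__after_atomic();		/* barrier(); too late to help */
	r0 = *y;

With the plain store now after the LOCK-prefixed instruction, TSO
permits the CPU to satisfy the "r0 = *y" load before the "*x = 1" store
is visible to other CPUs, so the intended store->load ordering is lost.
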
The other approach would be to upgrade smp_mb__{before,after}_atomic()
to actual full memory barriers on x86, but that seems quite ridiculous
since atomic_inc() already does all the expensive bits and is only
missing the compiler barrier.

That would result in code like:

	mov	$1, x			# *x = 1
	lock inc u			# atomic_inc(u)
	lock addl $0, -4(%rsp)		# aka smp_mb()
	mov	y, %r			# r0 = *y

which is really quite silly.

And as noted in the Changelog, about half the non-value-returning
atomics already implied the compiler barrier anyway.
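
For completeness, a minimal standalone sketch of that fix: give the
non-value-returning x86 atomics the "memory" clobber they were missing
(my_atomic_t and my_atomic_inc are made-up names for illustration, not
the kernel's):

	typedef struct { int counter; } my_atomic_t;

	static inline void my_atomic_inc(my_atomic_t *v)
	{
		asm volatile("lock incl %0"
			     : "+m" (v->counter)
			     :			/* no inputs */
			     : "memory");	/* the added compiler barrier:
						 * the compiler may no longer
						 * reorder other memory accesses
						 * across the atomic itself */
	}

The "memory" clobber is exactly what barrier() provides, so with it in
place smp_mb__{before,after}_atomic() need not emit anything extra on
x86.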