Message-ID: <20190613140204.GD18966@fuggles.cambridge.arm.com>
Date: Thu, 13 Jun 2019 15:02:04 +0100
From: Will Deacon <will.deacon@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: stern@...land.harvard.edu, akiyks@...il.com,
andrea.parri@...rulasolutions.com, boqun.feng@...il.com,
dlustig@...dia.com, dhowells@...hat.com, j.alglave@....ac.uk,
luc.maranget@...ia.fr, npiggin@...il.com, paulmck@...ux.ibm.com,
paul.burton@...s.com, linux-kernel@...r.kernel.org,
torvalds@...ux-foundation.org
Subject: Re: [PATCH v2 4/4] x86/atomic: Fix smp_mb__{before,after}_atomic()

On Thu, Jun 13, 2019 at 03:43:21PM +0200, Peter Zijlstra wrote:
> Recent probing at the Linux Kernel Memory Model uncovered a
> 'surprise'. Strongly ordered architectures where the atomic RmW
> primitive implies full memory ordering and
> smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
> fail for:
>
>     *x = 1;
>     atomic_inc(u);
>     smp_mb__after_atomic();
>     r0 = *y;
>
> Because, while the atomic_inc() implies memory order, it
> (surprisingly) does not provide a compiler barrier. This then allows
> the compiler to re-order like so:
>
>     atomic_inc(u);
>     *x = 1;
>     smp_mb__after_atomic();
>     r0 = *y;
>
> Which the CPU is then allowed to re-order (under TSO rules) like:
>
>     atomic_inc(u);
>     r0 = *y;
>     *x = 1;
>
> And this very much was not intended. Therefore strengthen the atomic
> RmW ops to include a compiler barrier.
>
> NOTE: atomic_{or,and,xor} and the bitops already had the compiler
> barrier.
>
> Reported-by: Andrea Parri <andrea.parri@...rulasolutions.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> Documentation/atomic_t.txt | 3 +++
> arch/x86/include/asm/atomic.h | 8 ++++----
> arch/x86/include/asm/atomic64_64.h | 8 ++++----
> arch/x86/include/asm/barrier.h | 4 ++--
> 4 files changed, 13 insertions(+), 10 deletions(-)
>
> --- a/Documentation/atomic_t.txt
> +++ b/Documentation/atomic_t.txt
> @@ -194,6 +194,9 @@ These helper barriers exist because arch
> ordering on their SMP atomic primitives. For example our TSO architectures
> provide full ordered atomics and these barriers are no-ops.
>
> +NOTE: when the atomic RmW ops are fully ordered, they should also imply a
> +compiler barrier.

Acked-by: Will Deacon <will.deacon@....com>

Cheers,

Will
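
For readers following the quoted changelog, here is a minimal sketch of the difference between an atomic increment that is not a compiler barrier and one that is. It is not the patch itself: the names demo_atomic_t, demo_atomic_inc_weak(), demo_atomic_inc_strong() and demo_thread0() are hypothetical, and using a "memory" clobber on the inline asm to get the compiler barrier is an assumption about how the strengthening could be done with GCC-style inline assembly. The non-x86 fallback only keeps the sketch compilable elsewhere.

```c
/* Compiler-only barrier: emits no instruction, but the "memory" clobber
 * forbids the compiler from moving memory accesses across it. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the kernel's atomic_t. */
typedef struct { int counter; } demo_atomic_t;

/* Weak form: the asm only declares that it reads and writes v->counter,
 * so the compiler may still move unrelated accesses (such as *x = 1)
 * across it. */
static inline void demo_atomic_inc_weak(demo_atomic_t *v)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lock; incl %0" : "+m" (v->counter));
#else
    __atomic_fetch_add(&v->counter, 1, __ATOMIC_RELAXED);
#endif
}

/* Strong form (assumed shape): the extra "memory" clobber makes the
 * increment a compiler barrier as well. */
static inline void demo_atomic_inc_strong(demo_atomic_t *v)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lock; incl %0" : "+m" (v->counter) : : "memory");
#else
    __atomic_fetch_add(&v->counter, 1, __ATOMIC_SEQ_CST);
#endif
}

/* The sequence from the changelog. With the strong increment, the store
 * to *x cannot be moved below the increment by the compiler. */
static inline int demo_thread0(int *x, const int *y, demo_atomic_t *u)
{
    *x = 1;
    demo_atomic_inc_strong(u);
    barrier();  /* what smp_mb__after_atomic() amounts to on x86, per the quoted text */
    return *y;
}
```

Run single-threaded, both variants behave identically. The difference only shows up in what the compiler is permitted to reorder, so comparing the generated assembly at -O2 is the way to observe it.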