Message-ID: <20190927124929.GB4643@worktop.programming.kicks-ass.net>
Date: Fri, 27 Sep 2019 14:49:29 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Andrea Parri <parri.andrea@...il.com>
Cc: David Howells <dhowells@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Will Deacon <will@...nel.org>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Mark Rutland <mark.rutland@....com>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
jose.marchesi@...cle.com
Subject: Re: Do we need to correct barriering in circular-buffers.rst?
On Fri, Sep 27, 2019 at 11:51:07AM +0200, Andrea Parri wrote:
> For the record, the LKMM doesn't currently model "order" derived from
> control dependencies to a _plain_ access (even if the plain access is
> a write): in particular, the following is racy (as far as the current
> LKMM is concerned):
>
> C rb
>
> { }
>
> P0(int *tail, int *data, int *head)
> {
> 	if (READ_ONCE(*tail)) {
> 		*data = 1;
> 		smp_wmb();
> 		WRITE_ONCE(*head, 1);
> 	}
> }
>
> P1(int *tail, int *data, int *head)
> {
> 	int r0;
> 	int r1;
>
> 	r0 = READ_ONCE(*head);
> 	smp_rmb();
> 	r1 = *data;
> 	smp_mb();
> 	WRITE_ONCE(*tail, 1);
> }
>
> Replacing the plain "*data = 1" with "WRITE_ONCE(*data, 1)" (or doing
> s/READ_ONCE(*tail)/smp_load_acquire(tail)) suffices to avoid the race.
> Maybe I'm short of imagination this morning... but I can't currently
> see how the compiler could "break" the above scenario.
The compiler; if sufficiently smart; is 'allowed' to change P0 into
something terrible like:
	*data = 1;
	if (*tail) {
		smp_wmb();
		*head = 1;
	} else
		*data = 0;
(assuming it knows *data was 0 from a prior store or something)
Using WRITE_ONCE() defeats this because volatile indicates external
visibility.
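
For concreteness, here is a minimal userspace sketch of P0 with the store marked. The macro definitions are stand-ins written for this illustration (assumptions, not the kernel's own definitions), they rely on GCC/Clang extensions, and p0_marked() is a hypothetical name. Because the store to *data is a volatile access, the compiler may not invent it on the path where *tail reads 0.

```c
/*
 * Userspace stand-ins for the kernel macros. These definitions are
 * assumptions made for illustration, not the kernel's own; they need
 * GCC or Clang (__typeof__, __atomic_thread_fence).
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define smp_wmb()        __atomic_thread_fence(__ATOMIC_RELEASE)

/* P0 from the litmus test, with the plain store replaced by WRITE_ONCE(). */
static int p0_marked(int *tail, int *data, int *head)
{
	if (READ_ONCE(*tail)) {
		WRITE_ONCE(*data, 1);	/* volatile store: cannot be hoisted or invented */
		smp_wmb();		/* order the data store before the head store */
		WRITE_ONCE(*head, 1);
		return 1;		/* published */
	}
	return 0;			/* nothing was written */
}
```

The return value exists only so the sketch can be exercised from a single thread; it is not part of the original litmus test.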
> I also didn't spend much time thinking about it. memory-barriers.txt
> has a section "CONTROL DEPENDENCIES" dedicated to "alerting developers
> using control dependencies for ordering". That's quite a long section
> (and probably still incomplete); the last paragraph summarizes: ;-)
Barring LTO the above works for perf because of inter-translation-unit
function calls, which imply a compiler barrier.
Now, when the compiler inlines, it loses that sync point (and thereby
subtly changes semantics from the non-inline variant). I suspect LTO
does the same and can cause subtle breakage through this transformation.
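
A rough sketch of what that lost sync point looks like when written out by hand: an opaque call into another translation unit behaves, as far as the compiler is concerned, much like an explicit compiler barrier, so an inlined variant could carry one explicitly. Everything below is an assumption for illustration (userspace stand-in macros, GCC/Clang only, hypothetical function name). It only shows where the barrier would sit; it does not make the plain store race-free under the LKMM.

```c
/*
 * Userspace stand-ins for the kernel macros; assumptions made for
 * illustration only, GCC/Clang specific.
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define smp_wmb()        __atomic_thread_fence(__ATOMIC_RELEASE)
#define barrier()        __asm__ __volatile__("" ::: "memory")

/*
 * Inlinable variant of P0. The explicit barrier() stands in for the
 * compiler barrier that a call across translation units would have
 * implied before inlining (or LTO) removed the call boundary.
 */
static inline int p0_inlined(int *tail, int *data, int *head)
{
	if (READ_ONCE(*tail)) {
		barrier();	/* the plain store below may not move above this */
		*data = 1;	/* still a plain store */
		smp_wmb();
		WRITE_ONCE(*head, 1);
		return 1;
	}
	return 0;
}
```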
> (*) Compilers do not understand control dependencies. It is therefore
> your job to ensure that they do not break your code.
It is on the list of things I want to talk about when I finally get
relevant GCC and LLVM people in the same room ;-)
Ideally the compiler can be taught to recognise conditionals dependent
on 'volatile' loads and disallow problematic transformations around
them.
I've added Nick (clang) and Jose (GCC) on Cc, hopefully they can help
find the right people for us.