Message-ID: <20210304161142.GB1612307@rowland.harvard.edu>
Date: Thu, 4 Mar 2021 11:11:42 -0500
From: Alan Stern <stern@...land.harvard.edu>
To: Boqun Feng <boqun.feng@...il.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
Björn Töpel <bjorn.topel@...il.com>,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
parri.andrea@...il.com, Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, npiggin@...il.com,
dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
Toke Høiland-Jørgensen <toke@...hat.com>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>
Subject: Re: XDP socket rings, and LKMM litmus tests
On Thu, Mar 04, 2021 at 02:33:32PM +0800, Boqun Feng wrote:
> Right, I was thinking about something unrelated.. but how about the
> following case:
>
> local_v = &y;
> r1 = READ_ONCE(*x); // f
>
> if (r1 == 1) {
>         local_v = &y; // e
> } else {
>         local_v = &z; // d
> }
>
> p = READ_ONCE(local_v); // g
>
> r2 = READ_ONCE(*p); // h
>
> if r1 == 1, we definitely think we have:
>
> f ->ctrl e ->rfi g ->addr h
>
> , and if we treat ctrl;rfi as "to-r", then we have "f" happens before
> "h". However, the compiler can optimize the above as:
>
> local_v = &y;
>
> r1 = READ_ONCE(*x); // f
>
> if (r1 != 1) {
>         local_v = &z; // d
> }
>
> p = READ_ONCE(local_v); // g
>
> r2 = READ_ONCE(*p); // h
>
> , and when this gets executed, I don't think we are guaranteed that
> "f" happens before "h", because the CPU can do optimistic reads for "g"
> and "h".
In your example, which accesses are supposed to be to actual memory and
which to registers? Also, remember that the memory model assumes the
hardware does not reorder loads if there is an address dependency
between them.
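
For reference, the two forms quoted above can be compared in a
single-threaded userspace sketch. It only checks that both forms pick
the same pointer for a given value of *x, which is why a compiler is
entitled to drop the store marked "e"; it says nothing about ordering
between CPUs. The READ_ONCE() stand-in, the function names, and the
values of y and z are assumptions made for illustration. Taking the
address of local_v in READ_ONCE() also forces it into memory, which a
real compiler might not do.

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's READ_ONCE(): a single volatile load.
 * This is an assumption for illustration, not the kernel definition.
 */
#define READ_ONCE(v) (*(const volatile __typeof__(v) *)&(v))

static int y = 10;	/* arbitrary illustrative values */
static int z = 20;

/* Original form: both branches store to local_v (accesses e and d). */
static int *pick_original(int *x)
{
	int *local_v = &y;
	int r1 = READ_ONCE(*x);		/* f */

	if (r1 == 1)
		local_v = &y;		/* e */
	else
		local_v = &z;		/* d */

	return READ_ONCE(local_v);	/* g */
}

/* Transformed form: the redundant store e is gone, only d remains. */
static int *pick_transformed(int *x)
{
	int *local_v = &y;
	int r1 = READ_ONCE(*x);		/* f */

	if (r1 != 1)
		local_v = &z;		/* d */

	return READ_ONCE(local_v);	/* g */
}

/* Returns 1 if both forms pick the same pointer for the given value of *x. */
static int forms_agree(int xval)
{
	int x = xval;

	return pick_original(&x) == pick_transformed(&x);
}
```

Calling forms_agree() with 0, 1, and 2 exercises both branches. In the
transformed form no store to local_v depends on r1 when r1 == 1, so the
"f ->ctrl e" link from the original form has nothing left to attach to.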
> Part of this is because when we take plain access into consideration, we
> won't guarantee a read-from or other relations exists if compiler
> optimization happens.
>
> Maybe I'm missing something subtle, but just try to think through the
> effect of making dep; rfi as "to-r".

Forget about local variables for the time being and just consider

        dep ; [Plain] ; rfi

For example:

        A: r1 = READ_ONCE(x);
           y = r1;
        B: r2 = READ_ONCE(y);

Should B be ordered after A? I don't see how any CPU could hope to
execute B before A, but maybe I'm missing something.

There's another twist, connected with the fact that herd7 can't detect
control dependencies caused by unexecuted code. If we have:

        A: r1 = READ_ONCE(x);
           if (r1)
                   WRITE_ONCE(y, 5);
           r2 = READ_ONCE(y);
        B: WRITE_ONCE(z, r2);

then in executions where x == 0, herd7 doesn't see any control
dependency. But CPUs do see control dependencies whenever there is a
conditional branch, whether the branch is taken or not, and so they will
never reorder B before A.

One last thing to think about: My original assessment of Björn's problem
wasn't right, because the dep in (dep ; rfi) doesn't include control
dependencies, only data and address dependencies. So I believe that the
LKMM wouldn't consider A to be ordered before B in this example even if
x was nonzero.
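
One way to probe this would be a load-buffering style litmus test along
the following lines. This is only a sketch: the test name, the second
process P1, the register r3, and the exists clause are assumptions added
for illustration, and it has not been run through herd7 here, so no
result is claimed. The exists clause describes the x-nonzero execution
in which A reads P1's store to x while P1 reads B's store to z.

```
C LB+ctrl-rfi-data+mb

{}

P0(int *x, int *y, int *z)
{
	int r1;
	int r2;

	r1 = READ_ONCE(*x);	/* A */
	if (r1)
		WRITE_ONCE(*y, 5);
	r2 = READ_ONCE(*y);
	WRITE_ONCE(*z, r2);	/* B */
}

P1(int *x, int *z)
{
	int r3;

	r3 = READ_ONCE(*z);
	smp_mb();
	WRITE_ONCE(*x, 1);
}

exists (0:r1=1 /\ 0:r2=5 /\ 1:r3=5)
```

If the LKMM really does not order A before B through ctrl ; rfi ; data,
then nothing in the model would be expected to forbid this outcome, even
though real CPUs should never produce it.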
Alan