Message-ID: <YEGFfjmOYfbuir9o@boqun-archlinux>
Date: Fri, 5 Mar 2021 09:12:30 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: Alan Stern <stern@...land.harvard.edu>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
Björn Töpel <bjorn.topel@...il.com>,
bpf <bpf@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
parri.andrea@...il.com, Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, npiggin@...il.com,
dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
Toke Høiland-Jørgensen <toke@...hat.com>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>
Subject: Re: XDP socket rings, and LKMM litmus tests
On Thu, Mar 04, 2021 at 11:11:42AM -0500, Alan Stern wrote:
> On Thu, Mar 04, 2021 at 02:33:32PM +0800, Boqun Feng wrote:
>
> > Right, I was thinking about something unrelated.. but how about the
> > following case:
> >
> > local_v = &y;
> > r1 = READ_ONCE(*x); // f
> >
> > if (r1 == 1) {
> > 	local_v = &y; // e
> > } else {
> > 	local_v = &z; // d
> > }
> >
> > p = READ_ONCE(local_v); // g
> >
> > r2 = READ_ONCE(*p); // h
> >
> > if r1 == 1, we definitely think we have:
> >
> > f ->ctrl e ->rfi g ->addr h
> >
> > , and if we treat ctrl;rfi as "to-r", then we have "f" happens before
> > "h". However, the compiler can optimize the above as:
> >
> > local_v = &y;
> >
> > r1 = READ_ONCE(*x); // f
> >
> > if (r1 != 1) {
> > 	local_v = &z; // d
> > }
> >
> > p = READ_ONCE(local_v); // g
> >
> > r2 = READ_ONCE(*p); // h
> >
> > , and when this gets executed, I don't think we are guaranteed that
> > "f" happens before "h", because the CPU can do optimistic reads for "g"
> > and "h".
>
> In your example, which accesses are supposed to be to actual memory and
> which to registers? Also, remember that the memory model assumes the
Given that we use READ_ONCE() on local_v, local_v should be a memory
location but only accessed by this thread.
> hardware does not reorder loads if there is an address dependency
> between them.
>
Right, so "g" won't be reordered after "h".
> > Part of this is because when we take plain accesses into consideration, we
> > can't guarantee that a read-from or other relation exists if compiler
> > optimization happens.
> >
> > Maybe I'm missing something subtle, but I'm just trying to think through
> > the effect of treating dep; rfi as "to-r".
>
> Forget about local variables for the time being and just consider
>
> dep ; [Plain] ; rfi
>
> For example:
>
> A: r1 = READ_ONCE(x);
>    y = r1;
> B: r2 = READ_ONCE(y);
>
> Should B be ordered after A? I don't see how any CPU could hope to
> execute B before A, but maybe I'm missing something.
>
Agreed.
> There's another twist, connected with the fact that herd7 can't detect
> control dependencies caused by unexecuted code. If we have:
>
> A: r1 = READ_ONCE(x);
>    if (r1)
>        WRITE_ONCE(y, 5);
>    r2 = READ_ONCE(y);
> B: WRITE_ONCE(z, r2);
>
> then in executions where x == 0, herd7 doesn't see any control
> dependency. But CPUs do see control dependencies whenever there is a
> conditional branch, whether the branch is taken or not, and so they will
> never reorder B before A.
>
Right, because B in this example is a write. What if B is a read that
depends on r2, as in my example? Let y be a pointer, initialized to a
valid value (pointing to a valid memory location), so that your example
becomes:
A: r1 = READ_ONCE(x);
   if (r1)
       WRITE_ONCE(y, 5);
C: r2 = READ_ONCE(y);
B: r3 = READ_ONCE(*r2);
Then A doesn't have a control dependency to B, because A and B are both
reads (read+read). So B can be ordered before A, right?
> One last thing to think about: My original assessment of Björn's problem
> wasn't right, because the dep in (dep ; rfi) doesn't include control
> dependencies. Only data and address. So I believe that the LKMM
Ah, right. I was missing that part (ctrl is not in dep). So I guess my
example is pointless for the question we are discussing here ;-(
> wouldn't consider A to be ordered before B in this example even if x
> was nonzero.
Yes, and the same goes for my example (with B changed to a read).
I did try to run my example with herd, and got confused: whether or not
I made dep; [Plain]; rfi a "to-r", I got the same result, telling me a
reordering can happen. Now the reason is clear: this is a ctrl; rfi, not
a dep; rfi.
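For reference, here is a minimal userspace sketch (not kernel code) of the
compiler transformation from my earlier example. It only checks that the
source form and the transformed form load the same value in a single thread,
which is why a compiler is allowed to drop the store "e". The READ_ONCE()
below is a simplified volatile-cast stand-in that relies on the GCC/Clang
__typeof__ extension, and the names source_form(), optimized_form(),
forms_agree() and check_forms(), as well as the values 10 and 20, are made up
for illustration.

```c
#include <assert.h>

/* Assumption: a simplified userspace stand-in for the kernel's READ_ONCE(). */
#define READ_ONCE(v) (*(const volatile __typeof__(v) *)&(v))

/* Illustrative targets; the values 10 and 20 are made up for this sketch. */
int y = 10;
int z = 20;

/* Source form: both branches store to local_v ("e" and "d"). */
int source_form(int *x)
{
	int *local_v = &y;
	int r1 = READ_ONCE(*x);		/* f */
	int *p;

	if (r1 == 1)
		local_v = &y;		/* e */
	else
		local_v = &z;		/* d */

	p = READ_ONCE(local_v);		/* g */
	return READ_ONCE(*p);		/* h */
}

/* Transformed form: the redundant store "e" has been removed. */
int optimized_form(int *x)
{
	int *local_v = &y;
	int r1 = READ_ONCE(*x);		/* f */
	int *p;

	if (r1 != 1)
		local_v = &z;		/* d */

	p = READ_ONCE(local_v);		/* g */
	return READ_ONCE(*p);		/* h */
}

/* Returns 1 when both forms load the same value for a given *x. */
int forms_agree(int xval)
{
	int x = xval;

	return source_form(&x) == optimized_form(&x);
}

/* Single-threaded check for the two interesting values of *x. */
void check_forms(void)
{
	assert(forms_agree(0));
	assert(forms_agree(1));
}
```

Calling check_forms() should pass silently. This says nothing about
ordering; it only shows that the two forms are equivalent for one thread, so
after the transformation there is no store "e" left for "g" to read from when
r1 == 1.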
Thanks so much for walking me through this ;-)
Regards,
Boqun
>
> Alan