Message-ID: <20190916115254.GB29216@tardis>
Date: Mon, 16 Sep 2019 19:52:54 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: Alan Stern <stern@...land.harvard.edu>
Cc: LKMM Maintainers -- Akira Yokosawa <akiyks@...il.com>,
Andrea Parri <parri.andrea@...il.com>,
Daniel Lustig <dlustig@...dia.com>,
David Howells <dhowells@...hat.com>,
Jade Alglave <j.alglave@....ac.uk>,
Luc Maranget <luc.maranget@...ia.fr>,
Nicholas Piggin <npiggin@...il.com>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will@...nel.org>,
Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Documentation for plain accesses and data races
On Mon, Sep 16, 2019 at 01:17:53PM +0800, Boqun Feng wrote:
> Hi Alan,
>
> I spent some time reading this; it's really helpful! Thanks.
>
> Please see comments below:
>
> On Fri, Sep 06, 2019 at 02:11:29PM -0400, Alan Stern wrote:
> [...]
> > If two memory accesses aren't concurrent then one must execute before
> > the other. Therefore the LKMM decides two accesses aren't concurrent
> > if they can be connected by a sequence of hb, pb, and rb links
> > (together referred to as xb, for "executes before"). However, there
> > are two complicating factors.
> >
> > If X is a load and X executes before a store Y, then indeed there is
> > no danger of X and Y being concurrent. After all, Y can't have any
> > effect on the value obtained by X until the memory subsystem has
> > propagated Y from its own CPU to X's CPU, which won't happen until
> > some time after Y executes and thus after X executes. But if X is a
> > store, then even if X executes before Y it is still possible that X
> > will propagate to Y's CPU just as Y is executing. In such a case X
> > could very well interfere somehow with Y, and we would have to
> > consider X and Y to be concurrent.
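> >
> > For instance, here is a minimal sketch of that situation (the variable
> > x and the labels X and Y are only for illustration):
> >
> >	int x = 0;
> >
> >	P0()
> >	{
> >		X: x = 1;	/* plain store */
> >	}
> >
> >	P1()
> >	{
> >		int r1;
> >
> >		Y: r1 = x;	/* plain load */
> >	}
> >
> > Even in an execution where X executes before Y, the store X might
> > reach P1's CPU only while Y is running, so nothing rules out X
> > interfering with Y and the two must be treated as concurrent.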
> >
> > Therefore when X is a store, for X and Y to be non-concurrent the LKMM
> > requires not only that X must execute before Y but also that X must
> > propagate to Y's CPU before Y executes. (Or vice versa, of course, if
> > Y executes before X -- then Y must propagate to X's CPU before X
> > executes if Y is a store.) This is expressed by the visibility
> > relation (vis), where X ->vis Y is defined to hold if there is an
> > intermediate event Z such that:
> >
> > X is connected to Z by a possibly empty sequence of
> > cumul-fence links followed by an optional rfe link (if none of
> > these links are present, X and Z are the same event),
> >
>
> I wonder whether we could add an optional ->coe or ->fre between X and
> the possibly empty sequence of cumul-fence links, similar to the
> definition of ->prop. This makes sense because if we have
>
> 	X ->coe X' (and X' is on CPU C)
>
> , X must already have propagated to C before X' executes, according to
> our operational model:
>
> "... In particular, it must arrange for the store to be co-later than
> (i.e., to overwrite) any other store to the same location which has
> already propagated to CPU C."
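>
> For example, a minimal sketch of the case I have in mind (the variable
> x and the labels are made up):
>
> 	P0()
> 	{
> 		X: WRITE_ONCE(x, 1);
> 	}
>
> 	P1()	/* CPU C */
> 	{
> 		W: WRITE_ONCE(x, 2);
> 	}
>
> If X ->coe W in an execution, then by the argument above X has to be
> propagated to C before W executes, so allowing an optional ->coe link
> at the front of the sequence shouldn't weaken the visibility guarantee.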
>
> In other words, we can define ->vis as:
>
> let vis = prop ; ((strong-fence ; [Marked] ; xbstar) | (xbstar & int))
>
Hmm.. so the problem with this approach is that the (xbstar & int) part
doesn't satisfy the requirement of visibility... i.e.

	X ->prop Z ->(xbstar & int) Y

may not guarantee that, when Y executes, X has already propagated to Y's
CPU. Lemme think more on this...

Regards,
Boqun
> , and for this document, we could reference the "prop" section in
> explanation.txt. This could make the model simpler (both to describe
> and to explain), and another benefit is that we won't need this commit
> in Paul's dev branch:
>
> c683f2c807d2 "tools/memory-model: Fix data race detection for unordered store and load"
>
> , and the data race rules will look more symmetrical ;-)
>
> Thoughts? Or am I missing something subtle here?
>
> Regards,
> Boqun
>
> > and either:
> >
> > Z is connected to Y by a strong-fence link followed by a
> > possibly empty sequence of xb links,
> >
> > or:
> >
> > Z is on the same CPU as Y and is connected to Y by a possibly
> > empty sequence of xb links (again, if the sequence is empty it
> > means Z and Y are the same event).
> >
> > The motivations behind this definition are straightforward:
> >
> > cumul-fence memory barriers force stores that are po-before
> > the barrier to propagate to other CPUs before stores that are
> > po-after the barrier.
> >
> > An rfe link from an event W to an event R says that R reads
> > from W, which certainly means that W must have propagated to
> > R's CPU before R executed.
> >
> > strong-fence memory barriers force stores that are po-before
> > the barrier, or that propagate to the barrier's CPU before the
> > barrier executes, to propagate to all CPUs before any events
> > po-after the barrier can execute.
> >
> > To see how this works out in practice, consider our old friend, the MP
> > pattern (with fences and statement labels, but without the conditional
> > test):
> >
> >	int buf = 0, flag = 0;
> >
> >	P0()
> >	{
> >		X: WRITE_ONCE(buf, 1);
> >		smp_wmb();
> >		W: WRITE_ONCE(flag, 1);
> >	}
> >
> >	P1()
> >	{
> >		int r1;
> >		int r2 = 0;
> >
> >		Z: r1 = READ_ONCE(flag);
> >		smp_rmb();
> >		Y: r2 = READ_ONCE(buf);
> >	}
> >
> > The smp_wmb() memory barrier gives a cumul-fence link from X to W, and
> > assuming r1 = 1 at the end, there is an rfe link from W to Z. This
> > means that the store to buf must propagate from P0 to P1 before Z
> > executes. Next, Z and Y are on the same CPU and the smp_rmb() fence
> > provides an xb link from Z to Y (i.e., it forces Z to execute before
> > Y). Therefore we have X ->vis Y: X must propagate to Y's CPU before Y
> > executes.
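> >
> > The other alternative in the definition (the strong-fence clause) can
> > be illustrated by a similar sketch (an ISA2-like extension of this
> > pattern; the variable data and the labels U and V are made up here):
> >
> >	int buf = 0, flag = 0, data = 0;
> >
> >	P0()
> >	{
> >		X: WRITE_ONCE(buf, 1);
> >		smp_wmb();
> >		W: WRITE_ONCE(flag, 1);
> >	}
> >
> >	P1()
> >	{
> >		int r1;
> >
> >		Z: r1 = READ_ONCE(flag);
> >		smp_mb();
> >		V: WRITE_ONCE(data, 1);
> >	}
> >
> >	P2()
> >	{
> >		int r2;
> >		int r3 = 0;
> >
> >		U: r2 = READ_ONCE(data);
> >		smp_rmb();
> >		Y: r3 = READ_ONCE(buf);
> >	}
> >
> > Assuming r1 = 1 and r2 = 1 at the end, X is again connected to Z by a
> > cumul-fence link followed by an rfe link, and this time Z is connected
> > to Y by a strong-fence link (the smp_mb()) followed by xb links (the
> > rfe from V to U and the ordering from the smp_rmb()). So once more
> > X ->vis Y: the store to buf must propagate to P2 before Y executes.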
> >
> [...]