Message-ID: <20180712120100.GA7404@andrea>
Date: Thu, 12 Jul 2018 14:01:00 +0200
From: Andrea Parri <andrea.parri@...rulasolutions.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Will Deacon <will.deacon@....com>,
Alan Stern <stern@...land.harvard.edu>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
LKMM Maintainers -- Akira Yokosawa <akiyks@...il.com>,
Boqun Feng <boqun.feng@...il.com>,
Daniel Lustig <dlustig@...dia.com>,
David Howells <dhowells@...hat.com>,
Jade Alglave <j.alglave@....ac.uk>,
Luc Maranget <luc.maranget@...ia.fr>,
Nicholas Piggin <npiggin@...il.com>,
Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

On Thu, Jul 12, 2018 at 01:52:49PM +0200, Andrea Parri wrote:
> On Thu, Jul 12, 2018 at 09:40:40AM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 11, 2018 at 02:34:21PM +0200, Andrea Parri wrote:
> > > Simplicity is in the eye of the beholder. From my POV (LKMM maintainer), the
> > > simplest solution would be to get rid of rfi-rel-acq and unlock-rf-lock-po
> > > (or its analogue in v3) altogether:
> >
> > <snip>
> >
> > > Among other things, this would immediately:
> > >
> > > 1) Enable RISC-V to use their .aq/.rl annotations _without_ having to
> > > "worry" about tso or release/acquire fences; IOW, this will permit
> > > a partial revert of:
> > >
> > > 0123f4d76ca6 ("riscv/spinlock: Strengthen implementations with fences")
> > > 5ce6c1f3535f ("riscv/atomic: Strengthen implementations with fences")
> >
> > But I feel this goes in the wrong direction. This weakens the effective
> > memory model, where I feel we should strengthen it.
> >
> > Currently PowerPC is the weakest here, and the above RISC-V changes
> > (reverts) would make RISC-V weaker still.
> >
> > And any effective weakening makes me very uncomfortable -- who knows
> > what will come apart this time. This memory ordering stuff causes
> > horrible subtle bugs at best.
>
> Indeed, what I was suggesting above is a weakening of the current model
> (and I agree: I wouldn't say that bugs in this context are nice ;-).
>
> These changes would affect a specific area: (IMO,) the examples we've
> been considering here aren't for the faint-hearted ;-) and as Daniel
> already suggested, everything would again be "nice and neat", if this
> was all about locking/if every thread used lock-synchronization.
>
>
> >
> > > 2) Resolve the above mentioned controversy (the inconsistency between
> > > - locking operations and atomic RMWs on one side, and their actual
> > > implementation in generic code on the other), thus enabling the use
> > > of LKMM _and_ its tools for the analysis/reviewing of the latter.
> >
> > This is a good point; so let's see if there is something we can do to
> > strengthen the model so it all works again.
> >
> > And I think if we raise atomic*_acquire() to require TSO (but ideally
> > raise it to RCsc) we're there.
> >
> > The TSO archs have RCpc load-acquire and store-release, but fully
> > ordered atomics. Most of the other archs have smp_mb() everything, with
> > the exception of PPC, ARM64 and now RISC-V.
> >
> > PPC has the RCpc TSO fence LWSYNC, ARM64 has the RCsc
> > load-acquire/store-release. And RISC-V has a gazillion of options IIRC.
> >
> >
> > So ideally atomic*_acquire() + smp_store_release() will be RCsc, and is
> > with the notable exception of PPC, and ideally RISC-V would be RCsc
> > here. But at the very least it should not be weaker than PPC.
> >
> > By increasing atomic*_acquire() to TSO we also immediately get the
> > proposed:
> >
> > P0()
> > {
> > 	WRITE_ONCE(X, 1);
> > 	spin_unlock(&s);
> > 	spin_lock(&s);
> > 	WRITE_ONCE(Y, 1);
> > }
> >
> > P1()
> > {
> > 	r1 = READ_ONCE(Y);
> > 	smp_rmb();
> > 	r2 = READ_ONCE(X);
> > }
> >
> > behaviour under discussion; because the spin_lock() will imply the TSO
> > ordering.
>
> You mean: "when paired with a po-earlier release to the same memory
> location", right? I am afraid that neither arm64 nor riscv current
> implementations would ensure "(r1 == 1 && r2 == 0) forbidden" if we
> removed the po-earlier spin_unlock()...
>
> AFAICT, the current implementation would work with that release: as
> you remarked above, arm64 release->acquire is SC; riscv has an rw,w
> fence in its spin_unlock() (hence a w,w fence between the stores),
> or it could have a .tso fence ...
>
> But again, these are subtle patterns, and my guess is that several/
> most kernel developers really won't care about such guarantees (and
> if some will do, they'll have the tools to figure out what they can
> actually rely on ...)
>
> OTOH (as I pointed out earlier) the strengthening we're configuring
> will prevent some arch. (riscv being just the example of today!) to
> go "full RCsc", and this will inevitably "complicate" both the LKMM
"full RCpc"
Andrea
> and the reviewing process of related changes (atomics, locking, ...;
> c.f., this debate), apparently, just because you ;-) want to "care"
> about these guarantees.
>
> Not yet convinced ... :/
>
> Andrea
>
>
> >
> > And note that this retains regular RCpc ACQUIRE for smp_load_acquire()
> > and associated primitives -- as they have had since their introduction
> > not too long ago.