Message-Id: <20170629184735.GC2393@linux.vnet.ibm.com>
Date: Thu, 29 Jun 2017 11:47:35 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Alan Stern <stern@...land.harvard.edu>,
Andrea Parri <parri.andrea@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
priyalee.kushwaha@...el.com,
Stanisław Drozd <drozdziak1@...il.com>,
Arnd Bergmann <arnd@...db.de>, ldr709@...il.com,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Josh Triplett <josh@...htriplett.org>,
Nicolas Pitre <nico@...aro.org>,
Krister Johansen <kjlx@...pleofstupid.com>,
Vegard Nossum <vegard.nossum@...cle.com>, dcb314@...mail.com,
Wu Fengguang <fengguang.wu@...el.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Rik van Riel <riel@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...nel.org>,
Luc Maranget <luc.maranget@...ia.fr>,
Jade Alglave <j.alglave@....ac.uk>
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.13
On Thu, Jun 29, 2017 at 11:17:26AM +0800, Boqun Feng wrote:
> On Wed, Jun 28, 2017 at 05:45:56PM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 28, 2017 at 05:05:46PM -0700, Linus Torvalds wrote:
> > > On Wed, Jun 28, 2017 at 4:54 PM, Paul E. McKenney
> > > <paulmck@...ux.vnet.ibm.com> wrote:
> > > >
> > > > Linus, are you dead-set against defining spin_unlock_wait() to be
> > > > spin_lock + spin_unlock? For example, is the current x86 implementation
> > > > of spin_unlock_wait() really a non-negotiable hard requirement? Or
> > > > would you be willing to live with the spin_lock + spin_unlock semantics?
> > >
> > > So I think the "same as spin_lock + spin_unlock" semantics are kind of insane.
> > >
> > > One of the issues is that the same as "spin_lock + spin_unlock" is
> > > basically now architecture-dependent. Is it really the
> > > architecture-dependent ordering you want to define this as?
> > >
> > > So I just think it's a *bad* definition. If somebody wants something
> > > that is exactly equivalent to spin_lock+spin_unlock, then dammit, just
> > > do *THAT*. It's completely pointless to me to define
> > > spin_unlock_wait() in those terms.
> > >
> > > And if it's not equivalent to the *architecture* behavior of
> > > > spin_lock+spin_unlock, then I think it should be described in terms
> > > > that aren't about the architecture implementation (so you shouldn't
> > > > describe it as "spin_lock+spin_unlock", you should describe it in
> > > > terms of memory barrier semantics).
> > >
> > > > And if we really have to use the spin_lock+spin_unlock semantics for
> > > this, then what is the advantage of spin_unlock_wait at all, if it
> > > doesn't fundamentally avoid some locking overhead of just taking the
> > > spinlock in the first place?
> > >
> > > And if we can't use a cheaper model, maybe we should just get rid of
> > > it entirely?
> > >
> > > Finally: if the memory barrier semantics are exactly the same, and
> > > it's purely about avoiding some nasty contention case, I think the
> > > concept is broken - contention is almost never an actual issue, and if
> > > it is, the problem is much deeper than spin_unlock_wait().
> >
> > All good points!
> >
> > I must confess that your sentence about getting rid of spin_unlock_wait()
> > entirely does resonate with me, especially given the repeated bouts of
> > "but what -exactly- is it -supposed- to do?" over the past 18 months
> > or so. ;-)
> >
> > Just for completeness, here is a list of the definitions that have been
> > put forward, just in case it inspires someone to come up with something
> > better:
> >
> > 1. spin_unlock_wait() provides only acquire semantics. Code
> > placed after the spin_unlock_wait() will see the effects of
> > > all previous critical sections, but there are no guarantees for
> > subsequent critical sections. The x86 implementation provides
> > this. I -think- that the ARM and PowerPC implementations could
> > get rid of a memory-barrier instruction and still provide this.
> >
>
> Yes, except we still need a smp_lwsync() in powerpc's
> spin_unlock_wait().
>
> And FWIW, the two smp_mb()s in spin_unlock_wait() on PowerPC exist there
> just because when Peter worked on commit 726328d92a42, we decided to let
> the fix for spin_unlock_wait() on PowerPC (i.e. commit 6262db7c088bb) go
> into the tree first to avoid some possible conflicts. And.. I forgot to
> do the clean-up for an acquire-semantics spin_unlock_wait() later.. ;-)
>
> I could send out the necessary fix once we have a conclusion for the
> semantics part.

If we end up still having spin_unlock_wait(), I will be happy to take
you up on that.

							Thanx, Paul

> Regards,
> Boqun
>
> > 2. As #1 above, but a "smp_mb();spin_unlock_wait();" provides the
> > additional guarantee that code placed before this construct is
> > seen by all subsequent critical sections. The x86 implementation
> > provides this, as do ARM and PowerPC, but it is not clear that all
> > architectures do. As Alan noted, this is an extremely unnatural
> > definition for the current memory model.
> >
> > 3. [ Just for completeness, yes, this is off the table! ] The
> > spin_unlock_wait() has the same semantics as a spin_lock()
> > followed immediately by a spin_unlock().
> >
> > 4. spin_unlock_wait() is analogous to synchronize_rcu(), where
> > spin_unlock_wait()'s "read-side critical sections" are the lock's
> > normal critical sections. This was the first definition I heard
> > that made any sense to me, but it turns out to be equivalent
> > to #3. Thus, also off the table.
> >
> > Does anyone know of any other possible definitions?
> >
> > Thanx, Paul
> >