linux-kernel - Re: [GIT PULL rcu/next] RCU commits for 4.13

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170628235412.GB3721@linux.vnet.ibm.com>
Date:   Wed, 28 Jun 2017 16:54:12 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrea Parri <parri.andrea@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        priyalee.kushwaha@...el.com, drozdziak1@...il.com,
        Arnd Bergmann <arnd@...db.de>, ldr709@...il.com,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Josh Triplett <josh@...htriplett.org>,
        Nicolas Pitre <nico@...aro.org>,
        Krister Johansen <kjlx@...pleofstupid.com>,
        Vegard Nossum <vegard.nossum@...cle.com>, dcb314@...mail.com,
        Wu Fengguang <fengguang.wu@...el.com>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Rik van Riel <riel@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...nel.org>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Jade Alglave <j.alglave@....ac.uk>
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.13

On Wed, Jun 28, 2017 at 04:16:03PM -0400, Alan Stern wrote:
> On Wed, 28 Jun 2017, Paul E. McKenney wrote:

[ . . . ]

> Yes.  Bear in mind that the PowerPC version of arch_spin_unlock_wait
> ends with smp_mb.  That already makes it a lot stronger than the
> smp_cond_load_acquire implementation on other architectures.

Fair enough!

> > So what am I missing here?
> 
> Our memory model is (deliberately) weaker than the guarantees provided
> by any of the hardware implementations.

Agreed -- the model cannot say that something is guaranteed unless -all-
the architectures provide that guarantee.  If I remember correctly, each
architecture currently running Linux is stronger in some way than at least
one other architecture.  So yes, the model has to be strictly weaker than
each of the architectures taken individually, because otherwise the model
would be making a guarantee that at least one architecture does not meet,
which would disqualify it from being a valid Linux-kernel memory model.

Plus, yes, there are a few places where the model is a bit weaker than
it needs to be in order to keep the model's complexity from exploding
beyond all reason.

>                                          So while adding smp_mb in
> front of smp_cond_load_acquire may suffice to give the desired
> semantics in many cases, it might not suffice for all architectures
> (because it doesn't suffice in the model).  In fact, we would need to 
> change the model to make it accept this idiom.

Understood.  And if Murphy is being his usual self, the change wouldn't
make the model any simpler.

> I admit not being able to point to any architectures where the
> difference matters.  But if you take this approach, you do need to
> document that for any future architectures, putting smp_mb in front of
> spin_unlock_wait must be guaranteed by the arch-specific code to have
> the desired effect.
> 
> Also, it would not be obvious to a reader _why_ putting an explicit
> smp_mb before spin_unlock_wait would make prior writes visible to later
> critical sections, since it's not obvious that spin_unlock_wait needs
> to do any writes.

Agreed, my current comments need help.  I will fix them as you suggest.

>                    Replacing the whole thing with spin_lock +
> spin_unlock would be easier to understand and wouldn't need any special 
> guarantees or explanations.

That was my initial choice, so I would look pretty silly arguing all
that much.  And I have had much better success getting the lock+unlock
definition across to people, so my experience agrees with yours on the
easy-to-understand part.

> > > If there are any places where this would add unacceptable overhead,
> > > maybe those places need some rethinking.  For instance, perhaps we
> > > could add a separate primitive that provides only release semantics.  
> > > (But would using the new primitive together with spin_unlock_wait
> > > really be significantly better than lock-unlock?)
> > 
> > At the moment, I have no idea on the relative overheads.  ;-)
> 
> I don't either.  But for architectures where spin_unlock_wait already 
> does the equivalent of load-locked + store-conditional, it's hard to 
> see why replacing it with spin_lock + spin_unlock would make it any 
> slower.

Well, the current implementations don't actually take the lock, which
means that they don't delay other lock holders quite so much.  Which might
not be a huge difference even at high levels of lock contention, and
would definitely be down in the noise for low levels of lock contention.

Digital packrat that I am, I am still carrying both sets of patches,
so can still go either way.

Linus, are you dead-set against defining spin_unlock_wait() to be
spin_lock + spin_unlock?  For example, is the current x86 implementation
of spin_unlock_wait() really a non-negotiable hard requirement?  Or
would you be willing to live with the spin_lock + spin_unlock semantics?

							Thanx, Paul