Message-ID: <20160912123502.174738ea@roar.ozlabs.ibm.com>
Date: Mon, 12 Sep 2016 12:35:02 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: Will Deacon <will.deacon@....com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Michael Ellerman <mpe@...erman.id.au>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
Alan Stern <stern@...land.harvard.edu>
Subject: Re: Question on smp_mb__before_spinlock

On Wed, 7 Sep 2016 14:51:47 +0100
Will Deacon <will.deacon@....com> wrote:

> On Wed, Sep 07, 2016 at 03:23:54PM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 07, 2016 at 10:17:26PM +1000, Nicholas Piggin wrote:
> > > It seems okay, but why not make it a special sched-only function name
> > > to prevent it being used in generic code?
> > >
> > > I would not mind seeing responsibility for the switch barrier moved to
> > > generic context switch code like this (alternative for powerpc reducing
> > > number of hwsync instructions was to add documentation and warnings about
> > > the barriers in arch dependent and independent code). And pairing it with
> > > a spinlock is reasonable.
> > >
> > > It may not strictly be an "smp_" style of barrier if MMIO accesses are to
> > > be ordered here too, despite critical section may only be providing
> > > acquire/release for cacheable memory, so maybe it's slightly more
> > > complicated than just cacheable RCsc?
> >
> > Interesting idea..
> >
> > So I'm not a fan of that raw_spin_lock wrapper, since that would end up
> > with a lot more boiler-plate code than just the one extra barrier.
> >
> > But moving MMIO/DMA/TLB etc.. barriers into this spinlock might not be a
> > good idea, since those are typically fairly heavy barriers, and it's
> > quite common to call schedule() without ending up in switch_to().
> >
> > For PowerPC it works out, since there's only SYNC, no other option
> > afaik.
> >
> > But ARM/ARM64 will have to do DSB(ISH) instead of DMB(ISH). IA64 would
> > need to issue "sync.i" and mips-octeon "synciobdma".
> >
> > Will, any idea of the extra cost involved in DSB vs DMB?
>
> DSB is *much* more expensive, since it completes out-of-band communication
> such as MMIO accesses and TLB invalidation, as well as plain old memory
> accesses.
>
> The only reason we have DSB in our __switch_to code is to complete cache
> maintenance in case the task is going to migrate to another CPU; there's
> just no way to know that at the point we need to do the barrier :(

Unfortunately, it's not trivial to move such barriers to migrate time,
because the source CPU may not be involved after the task is switched
out.

This won't prevent ARM32/64 from continuing to do what it does today,
as long as we note that the arch must provide such barriers *either* in
the context switch lock / barrier, *or* in its own switch code.
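
As a rough illustration of that either/or rule (this is not code from
the thread or from the kernel), here is a userspace C11 sketch. Every
name in it (ARCH_SWITCH_HAS_FULL_BARRIER, generic_switch_lock,
arch_switch_to, context_switch_sketch) is hypothetical, and a seq_cst
fence merely stands in for a heavy barrier such as hwsync or DSB.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical knob: 1 means the arch's own switch code already issues
 * the heavy barrier, so the generic lock path does not repeat it. */
#ifndef ARCH_SWITCH_HAS_FULL_BARRIER
#define ARCH_SWITCH_HAS_FULL_BARRIER 0
#endif

static int barriers_issued;               /* bookkeeping for the sketch only */
static atomic_flag rq_lock = ATOMIC_FLAG_INIT;

/* Stand-in for a heavy barrier (assumption: a seq_cst fence). */
static void full_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
    barriers_issued++;
}

/* Generic path: barrier paired with the lock, unless the arch opted out. */
static void generic_switch_lock(void)
{
    if (!ARCH_SWITCH_HAS_FULL_BARRIER)
        full_barrier();
    while (atomic_flag_test_and_set_explicit(&rq_lock, memory_order_acquire))
        ;                                 /* spin until the lock is free */
}

static void generic_switch_unlock(void)
{
    atomic_flag_clear_explicit(&rq_lock, memory_order_release);
}

/* Arch path: only reached when a task switch really happens. */
static void arch_switch_to(void)
{
    if (ARCH_SWITCH_HAS_FULL_BARRIER)
        full_barrier();
}

/* Returns how many heavy barriers this pass issued. */
static int context_switch_sketch(bool need_switch)
{
    int before = barriers_issued;

    generic_switch_lock();
    if (need_switch)
        arch_switch_to();
    generic_switch_unlock();

    int issued = barriers_issued - before;
    /* Whenever a switch occurs, exactly one side supplied the barrier. */
    assert(!need_switch || issued == 1);
    return issued;
}
```

With the knob left at 0 the barrier is paid on every pass through the
lock; with it set to 1 it is paid only when arch_switch_to() runs, which
is the trade-off discussed in the quoted text above.
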

Thanks,
Nick