Message-ID: <20170728170113.iu3pjarfxim5nwby@hirez.programming.kicks-ass.net>
Date: Fri, 28 Jul 2017 19:01:13 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: "Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org, Boqun Feng <boqun.feng@...il.com>,
Andrew Hunter <ahh@...gle.com>,
Maged Michael <maged.michael@...il.com>, gromer@...gle.com,
Avi Kivity <avi@...lladb.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>
Subject: Re: [RFC PATCH v3] membarrier: expedited private command
On Fri, Jul 28, 2017 at 12:54:29PM -0400, Mathieu Desnoyers wrote:
> Scheduler-wise, it requires a memory barrier before and after context
> switching between processes (which have different mm). The memory
> barrier before context switch is already present. After context switch,
> finish_lock_switch() acts as a RELEASE barrier, which is a full memory
> barrier on all architectures except PowerPC. Add a
> smp_mb__after_unlock_lock() to promote this barrier to a full memory
> barrier. This is a no-op on all architectures but PowerPC.
Not quite complete:
Our TSO archs can do RELEASE without being a full barrier. Look at x86's
spin_unlock() being a regular STORE for example. But for those archs,
all atomics imply smp_mb() and all of them have atomic ops in switch_mm()
for mm_cpumask().
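
For example, on x86 switch_mm() contains something like (simplified):

	/*
	 * Atomic RMW on the mm cpumask; these are LOCK-prefixed
	 * instructions on x86 and hence full memory barriers there.
	 */
	cpumask_set_cpu(cpu, mm_cpumask(next));
	cpumask_clear_cpu(cpu, mm_cpumask(prev));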
Of all the weakly ordered machines, only ARM64 and PPC can do RELEASE; the
rest indeed do smp_mb(), so there the spin_unlock() is a full barrier and
we're good.
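
That is, absent a native RELEASE, the unlock falls back to the generic
store-release pattern -- a simplified sketch, assuming the asm-generic
smp_store_release() fallback ("lock" and its field are stand-in names:)

	smp_mb();			/* order the critical section first */
	WRITE_ONCE(lock->locked, 0);	/* then drop the lock, plain store */

which is exactly why spin_unlock() ends up being a full barrier on those
machines.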
ARM64 has a very heavy barrier in switch_to(), which suffices.
PPC just removed its barrier from switch_to(), but appears to be talking
about adding something to switch_mm().