Message-ID: <792537721.5599.1484618978163.JavaMail.zimbra@efficios.com>
Date: Tue, 17 Jan 2017 02:09:38 +0000 (UTC)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
Josh Triplett <josh@...htriplett.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
rostedt <rostedt@...dmis.org>,
Nicholas Miell <nmiell@...cast.net>,
Ingo Molnar <mingo@...hat.com>,
One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
Lai Jiangshan <laijs@...fujitsu.com>,
Stephen Hemminger <stephen@...workplumber.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
David Howells <dhowells@...hat.com>,
bobby prani <bobby.prani@...il.com>,
Michael Kerrisk <mtk.manpages@...il.com>,
Shuah Khan <shuahkh@....samsung.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC PATCH] membarrier: handle nohz_full with expedited thread
registration
----- On Jan 16, 2017, at 6:50 PM, Linus Torvalds torvalds@...ux-foundation.org wrote:
> Why not just make the write be a "smp_store_release()", and the read
> be a "smp_load_acquire()". That guarantees a certain amount of
> ordering. The only amount that I suspect makes sense, in fact.
>
> But it's not clear what the problem is, so..
If we only use a smp_store_release() for the store to membarrier_exped,
the "unregister" (setting back to 0) would be OK, but not the "register",
as the following scenario shows:
Initial values:
A = B = 0
CPU 0                                 | CPU 1 (no-hz full)
                                      |
                                      | membarrier(REGISTER_EXPEDITED)
                                      |   (write barrier implied by store-release)
                                      |   set t->membarrier_exped = 1 (store-release implies memory barrier before store)
                                      | store B = 1
                                      | barrier() (compiler-level barrier)
                                      | store A = 1
x = load A                            |
membarrier(CMD_SHARED)                |
  smp_mb() [1]                        |
  iter. on nohz cpus                  |
    if iter_t->membarrier_exped == 0  |
      (skip)                          |
  smp_mb() [2]                        |
y = load B                            |
Expect: if x == 1, then y == 1
CPU 0 can observe A == 1, membarrier_exped == 0, and B == 0,
because there is no memory barrier between store to
membarrier_exped and store to A on CPU 1.
What we seem to need on the registration/unregistration side
is store-acquire for registration, and store-release for
unregistration. This pairs with a load of membarrier_exped
that has both acquire and release barriers ([1] and [2] above).
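To make the required ordering concrete, here is a minimal user-space sketch
using C11 atomics. It is only a model under stated assumptions, not the kernel
patch: the globals, the struct and the function names are hypothetical,
atomic_thread_fence(memory_order_seq_cst) stands in for smp_mb(), and
atomic_signal_fence() stands in for barrier(). The full fence placed after the
flag store is what a plain store-release does not provide. With it, a CPU 0
that observes A == 1 should also observe membarrier_exped == 1, so it would
not take the "skip" path.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-ins for the variables in the scenario. */
static atomic_int membarrier_exped;	/* models t->membarrier_exped */
static atomic_int A, B;			/* both start at 0 */

struct observation {
	int x;		/* value loaded from A */
	int exped;	/* value loaded from membarrier_exped */
	int y;		/* value loaded from B */
};

/* CPU 1 side: registration followed by the two stores. */
static void cpu1_register_then_store(void)
{
	atomic_store_explicit(&membarrier_exped, 1, memory_order_relaxed);
	/* Assumed fix: full fence AFTER the flag store (models smp_mb()). */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&B, 1, memory_order_relaxed);
	/* Compiler-only barrier, models barrier(). */
	atomic_signal_fence(memory_order_seq_cst);
	atomic_store_explicit(&A, 1, memory_order_relaxed);
}

/* CPU 0 side: load A, then the flag check bracketed by [1] and [2]. */
static struct observation cpu0_observe(void)
{
	struct observation o;

	o.x = atomic_load_explicit(&A, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() [1] */
	o.exped = atomic_load_explicit(&membarrier_exped,
				       memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() [2] */
	o.y = atomic_load_explicit(&B, memory_order_relaxed);
	return o;
}

/* Property the fence is meant to give: x == 1 implies exped == 1. */
static void check_expectation(struct observation o)
{
	assert(!(o.x == 1 && o.exped == 0));
}
```

Running cpu1_register_then_store() and cpu0_observe() on two threads and
passing each observation to check_expectation() would exercise the property.
Moving the fence before the flag store (the store-release shape) removes the
ordering between the flag store and the store to A, which is the failing case
in the scenario above.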
> I'm not seeing how a regular fork() could possibly ever make sense to
> have the membarrier state in the newly forked process. Not that
> "fork()" is really well-defined for within a single thread anyway (it
> actually is as far as Linux is concerned, but not in POSIX, afaik).
>
> So if there is no major reason for it, I would strongly suggest that
> _if_ all this makes sense in the first place, the membarrier thing
> should just be cleared unconditionally both for exec and for
> clone/fork.
That's fine with me!
Thanks,
Mathieu
>
> Linus
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com