Message-ID: <20120216121807.GA3426@Krystal>
Date: Thu, 16 Feb 2012 07:18:07 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Mathieu Desnoyers <compudj@...stal.dyndns.org>,
linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
dipankar@...ibm.com, akpm@...ux-foundation.org,
josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
eric.dumazet@...il.com, darren@...art.com, fweisbec@...il.com,
patches@...aro.org
Subject: Re: [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU
implementation
* Peter Zijlstra (peterz@...radead.org) wrote:
> On Thu, 2012-02-16 at 06:00 -0500, Mathieu Desnoyers wrote:
> > This brings the following question then: which memory barriers, in the
> > scheduler activity, act as full memory barriers to migrated threads ? I
> > see that the rq_lock is taken, but this lock is permeable in one
> > direction (operations can spill into the critical section). I'm probably
> > missing something else, but this something else probably needs to be
> > documented somewhere, since we are doing tons of assumptions based on
> > it.
>
> A migration consists of two context switches, one switching out the task
> on the old cpu, and one switching in the task on the new cpu.
If we have memory barriers on both context switches, then we should be
good. I just fail to see them.
> Now on x86 all the rq->lock grabbery is plenty implied memory barriers
> to make anybody happy.
Indeed. Outside of x86, things are far less certain though.
> But I think, since there's guaranteed order (can't switch to before
> switching from) you can match the UNLOCK from the switch-from to the
> LOCK from the switch-to to make your complete MB.
>
> Does that work or do we need more?
Hrm, I think we'd need a little more than just lock/unlock ordering
guarantees. Let's consider the following, where the stores would be
expected to be seen as "store A before store B" by CPU 2
CPU 0                    CPU 1                    CPU 2

                                                  load B, smp_rmb, load A in loop,
                                                  expecting that when updated A is
                                                  observed, B is always observed as
                                                  updated too.
store A
(lock is permeable:
 outside can leak
 inside)
lock(rq->lock)

         -> migration ->

                         unlock(rq->lock)
                         (lock is permeable:
                          outside can leak inside)
                         store B
As we can see, the "store A" could theoretically still be pending in
CPU 0's write buffers when store B occurs, because the memory barrier
associated with "lock" only has acquire semantics (so memory operations
prior to the lock can leak into the critical section).
Given that the unlock(rq->lock) on CPU 0 is not guaranteed to happen in
a bound time-frame, no memory barrier with release semantic can be
assumed to have happened. This could happen if we have a long critical
section holding the rq->lock on CPU 0, and a much shorter critical
section on CPU 1.
Does that make sense, or should I get my first morning coffee? :)
Thanks,
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/