Message-ID: <alpine.LFD.2.00.1002011227360.4206@localhost.localdomain>
Date: Mon, 1 Feb 2010 12:42:52 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc: akpm@...ux-foundation.org, Ingo Molnar <mingo@...e.hu>,
linux-kernel@...r.kernel.org,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Steven Rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Nicholas Miell <nmiell@...cast.net>, laijs@...fujitsu.com,
dipankar@...ibm.com, josh@...htriplett.org, dvhltc@...ibm.com,
niv@...ibm.com, tglx@...utronix.de, peterz@...radead.org,
Valdis.Kletnieks@...edu, dhowells@...hat.com
Subject: Re: [patch 2/3] scheduler: add full memory barriers upon task switch
at runqueue lock/unlock
On Mon, 1 Feb 2010, Mathieu Desnoyers wrote:
>
> The two event pairs we are looking at are:
>
> Pair 1)
>
> * memory accesses (load/stores) performed by user-space thread before
> context switch.
> * cpumask_clear_cpu(cpu, mm_cpumask(prev));
>
> Pair 2)
>
> * cpumask_set_cpu(cpu, mm_cpumask(next));
> * memory accesses (load/stores) performed by user-space thread after
> context switch.
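
For reference, the context-switch path those two pairs live in looks
roughly like this (an illustrative sketch only - the function name and
the barrier placement are shorthand for what is being proposed, not the
actual arch code or the patch):

	static inline void switch_mm_sketch(struct mm_struct *prev,
					    struct mm_struct *next,
					    struct task_struct *tsk)
	{
		unsigned int cpu = smp_processor_id();

		if (likely(prev != next)) {
			/* Pair 1: user accesses of 'prev' happened before this */
			smp_mb();			/* proposed barrier */
			cpumask_clear_cpu(cpu, mm_cpumask(prev));

			cpumask_set_cpu(cpu, mm_cpumask(next));
			smp_mb();			/* proposed barrier */
			/* Pair 2: user accesses of 'next' happen after this */
		}
	}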
So explain why that smp_mb() in between the two _helps_?
The user of this will do a

	for_each_cpu(mm_cpumask)
		send_IPI(cpu, smp_mb);

but that's not an atomic op _anyway_. So you're reading mm_cpumask
somewhere earlier, and doing the send_IPI later. So look at the whole
scenario 2:

	cpumask_set_cpu(cpu, mm_cpumask(next));
	memory accesses performed by user-space

and think about it from the perspective of another CPU. What does an
smp_mb() in between the two do?
I'll tell you - it does NOTHING. Because it doesn't matter. I see no
possible way another CPU can care, because let's assume that the other CPU
is doing that

	for_each_cpu(mm_cpumask)
		send_ipi(smp_mb);

and you have to realize that the other CPU needs to read that mm_cpumask
early in order to do that.
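
Spelled out a bit (the helper names are made up, this is just to show
where the mm_cpumask read sits relative to the IPIs), that caller side
is something like:

	static void membarrier_ipi(void *info)
	{
		smp_mb();	/* the barrier run on each remote CPU */
	}

	static void send_membarrier_ipis(struct mm_struct *mm)
	{
		int cpu;

		/* the mm_cpumask read happens here, before any IPI goes out */
		for_each_cpu(cpu, mm_cpumask(mm))
			smp_call_function_single(cpu, membarrier_ipi, NULL, 1);
	}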
So you have this situation:

	CPU1				CPU2
	----				----
	cpumask_set_cpu
					read mm_cpumask
	smp_mb
					smp_mb
	user memory accesses
					send_ipi

and exactly _what_ is that "smp_mb" on CPU1 protecting against?
Realize that CPU2 is not ordered (because you wanted to avoid the
locking), so the "read mm_cpumask" can happen before or after that
cpumask_set_cpu. And it can happen before or after REGARDLESS of that
smp_mb. The smp_mb doesn't make any difference to CPU2 that I can see.
So the question becomes one of "How can CPU2 care about whether CPU1 is in
the mask"? Considering that CPU2 doesn't do any locking, I don't see any
way you can get a "consistent" CPU mask _regardless_ of any smp_mb's in
there. When it does the "read mm_cpumask()" it might get the value
_before_ the cpumask_set_cpu, and it might get the value _after_, and
that's true regardless of whether there is a smp_mb there or not.
See what I'm asking for? I'm asking for why it matters that we have a
memory barrier, and why that mm_cpumask is so magical that _that_ access
matters so much.
Maybe I'm dense. But if somebody puts memory barriers in the code, I want
to know exactly what the reason for the barrier is. Memory ordering is too
subtle and non-intuitive to go by gut feel.
Linus