Message-ID: <20090218165808.GA9120@elte.hu>
Date: Wed, 18 Feb 2009 17:58:08 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Suresh Siddha <suresh.b.siddha@...el.com>,
"Pallipadi, Venkatesh" <venkatesh.pallipadi@...el.com>,
Yinghai Lu <yinghai@...nel.org>, Nick Piggin <npiggin@...e.de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Oleg Nesterov <oleg@...hat.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Jens Axboe <jens.axboe@...cle.com>,
Rusty Russell <rusty@...tcorp.com.au>,
Steven Rostedt <rostedt@...dmis.org>,
linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org
Subject: Re: Q: smp.c && barriers (Was: [PATCH 1/4] generic-smp: remove
single ipi fallback for smp_call_function_many())
* Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Wed, 18 Feb 2009, Ingo Molnar wrote:
> >
> > But ... WRMSR should already be serializing - it is documented
> > as a serializing instruction.
>
> Hmm. I was thinking about this some more, and I think I've
> come up with an explanation.
>
> "wrmsr" probably serializes _after_ doing the write. After
> all, it's historically used for changing internal CPU state,
> so you want to do the write, and then wait until the effects
> of the write are "stable" in the core.
>
> That would explain how x2apic can use both a serializing
> instruction (wrmsr) and still effectively cause the IPI to
> happen out of sequence: the IPI can reach the destination CPU
> before the source CPU has flushed its store buffers, because
> the IPI is actually sent before serializing the core.
>
> But I would very strongly put this in the "x2apic code bug"
> column. If this is a true issue (and your TLB patch does imply
> it is), then we should just make sure that the x2apic IPI
> calls always do a 'sfence' before they happen - regardless of
> whether they are for TLB flushes or for generic kernel
> cross-calls, or for anything else.

Yeah, that makes perfect sense. IPIs are an out-of-band
signalling mechanism that does not follow the normal cache
coherency rules.

Moving the smp_mb() to the x2apic-specific code will also speed
up the normal mmio-mapped IPI sequence a bit. It should be an
smp_wmb(), I suspect, which turns it into an sfence.
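
Roughly, the ordering being discussed could be sketched as below. This
is a userspace illustration only, not the actual patch: fake_wrmsr(),
x2apic_send_ipi_sketch(), publish_and_kick() and the event log are
made-up names, and a C11 fence stands in for whichever kernel barrier
ends up being used. The only point is that the barrier lives in the
x2apic send path, between the data store and the ICR write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define X2APIC_ICR_MSR 0x830u	/* x2APIC interrupt command register MSR */
#define MAX_EVENTS     8

enum event { EV_STORE, EV_FENCE, EV_WRMSR };

/* Records the program order of the steps so it can be inspected. */
static enum event event_log[MAX_EVENTS];
static int nr_events;

static void log_event(enum event ev)
{
	assert(nr_events < MAX_EVENTS);
	event_log[nr_events++] = ev;
}

/*
 * Hypothetical stub: real code would execute the privileged WRMSR
 * instruction here. This one only records that it was called.
 */
static void fake_wrmsr(uint32_t msr, uint64_t val)
{
	(void)msr;
	(void)val;
	log_event(EV_WRMSR);
}

/* Payload that the target CPU would read from its IPI handler. */
static int shared_data;

/*
 * Assumed shape of an x2apic IPI send: fence first, then write the ICR.
 * The C11 fence is a userspace stand-in for the kernel barrier; it is
 * an assumption of this sketch, not what the kernel code uses.
 */
static void x2apic_send_ipi_sketch(uint32_t dest_apicid, uint8_t vector)
{
	atomic_thread_fence(memory_order_seq_cst);
	log_event(EV_FENCE);

	/* x2APIC ICR layout: destination in bits 63:32, vector in bits 7:0. */
	fake_wrmsr(X2APIC_ICR_MSR, ((uint64_t)dest_apicid << 32) | vector);
}

/* Caller side: publish the data, then kick the other CPU. */
static void publish_and_kick(int value, uint32_t dest_apicid, uint8_t vector)
{
	shared_data = value;
	log_event(EV_STORE);
	x2apic_send_ipi_sketch(dest_apicid, vector);
}
```

With the barrier inside the send helper, every caller (TLB flush,
generic cross-call, anything else) gets the ordering without having to
remember it, and the mmio-mapped path does not pay for it.
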
Ingo