Message-ID: <AANLkTiml2uwYqQayTKjMN2gI3LnjVFpwxXkv8GN3McEE@mail.gmail.com>
Date: Wed, 14 Jul 2010 09:28:41 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
Steven Rostedt <rostedt@...tedt.homelinux.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Christoph Hellwig <hch@....de>, Li Zefan <lizf@...fujitsu.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Johannes Berg <johannes.berg@...el.com>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Tom Zanussi <tzanussi@...il.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Andi Kleen <andi@...stfloor.org>,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
"H. Peter Anvin" <hpa@...or.com>,
Jeremy Fitzhardinge <jeremy@...p.org>,
"Frank Ch. Eigler" <fche@...hat.com>
Subject: Re: [patch 1/2] x86_64 page fault NMI-safe
On Wed, Jul 14, 2010 at 8:49 AM, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com> wrote:
>> I think you're vastly overestimating what is sane to do from an NMI
>> context. It is utterly and totally insane to assume vmalloc is available
>> in NMI.
I agree that NMI handlers shouldn't touch vmalloc space. But now that
percpu data is mapped through the VM, I do agree that other CPUs may
potentially need to touch that data, and an interrupt (including an
NMI) might be the first to create the mapping.
And that's why the faulting code needs to be interrupt-safe for the
vmalloc area.
However, it does look like the current scheduler should make it safe
to access "current->mm->pgd" from regular interrupts, so the problem
is apparently only an NMI issue. So exactly what are the circumstances
that create and expose percpu data on a CPU _without_ mapping it on
that CPU?
IOW, I'm missing some background here. I agree that at least some
basic percpu data should generally be available for an NMI handler,
but at the same time I wonder how come that basic percpu data wasn't
already mapped?
The traditional irq vmalloc race was something like:
 - one CPU does a "fork()", which copies the basic kernel mappings
 - in another thread a driver does a vmalloc(), which creates a _new_
   mapping that didn't get copied.
 - later on a switch_to() switches to the newly forked process that
   missed the vmalloc initialization
 - we take an interrupt for the driver that needed the new vmalloc
   space, but now it doesn't have it, and we fill it in at run-time for
   the (rare) race.
and I'm simply not seeing how fork() could ever race with percpu data setup.
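For reference, the traditional race above can be modelled in user space.
What follows is only a toy sketch, not kernel code: the names toy_pgd,
reference_pgd, toy_fork, toy_vmalloc, toy_vmalloc_fault and toy_access are
all hypothetical, a "page table" is reduced to a small array, and the
concurrency is replaced by a fixed ordering (fork first, vmalloc second).
It only shows the lazy fill-in on fault, under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define KERNEL_SLOTS 8 /* toy number of top-level kernel entries (assumption) */

/* Toy top-level page table: a nonzero entry means "mapped". */
struct toy_pgd {
	unsigned long entry[KERNEL_SLOTS];
};

/* Reference table, standing in for the kernel's master page directory. */
static struct toy_pgd reference_pgd;

/* fork(): the child snapshots whatever kernel entries exist right now. */
static void toy_fork(struct toy_pgd *child)
{
	memcpy(child, &reference_pgd, sizeof(*child));
}

/* vmalloc(): the new mapping lands in the reference table only. */
static void toy_vmalloc(int slot, unsigned long value)
{
	assert(slot >= 0 && slot < KERNEL_SLOTS);
	assert(value != 0);
	reference_pgd.entry[slot] = value;
}

/*
 * Fault path: copy the missing entry from the reference table.
 * Returns false when the reference has no mapping either, which
 * in this model is a genuine bad access.
 */
static bool toy_vmalloc_fault(struct toy_pgd *active, int slot)
{
	assert(slot >= 0 && slot < KERNEL_SLOTS);
	if (reference_pgd.entry[slot] == 0)
		return false;
	active->entry[slot] = reference_pgd.entry[slot];
	return true;
}

/*
 * An access from "interrupt context": if the active table lacks the
 * entry, count a fault and try the lazy fill-in before reading.
 * Returns 0 when the access could not be satisfied.
 */
static unsigned long toy_access(struct toy_pgd *active, int slot, int *faults)
{
	assert(slot >= 0 && slot < KERNEL_SLOTS);
	if (active->entry[slot] == 0) {
		(*faults)++;
		if (!toy_vmalloc_fault(active, slot))
			return 0;
	}
	return active->entry[slot];
}
```

Calling toy_fork() on a child table, then toy_vmalloc() on some slot, then
toy_access() on the child for that slot reproduces the sequence: the first
access takes one fault and fills the entry in, and later accesses to the
same slot take none. The point for the NMI discussion is that the fill-in
step runs inside the fault handler, so that handler has to be safe in
whatever context the access came from.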
So please just document the sequence that actually needs the page
table setup for the NMI/percpu case.
This patch (1/2) doesn't look horrible per se. I have no problems with
it. I just want to understand why it is needed.
Linus