[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20141121023310.GB30603@home.goodmis.org>
Date: Thu, 20 Nov 2014 21:33:11 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Tejun Heo <tj@...nel.org>
Cc: Andy Lutomirski <luto@...capital.net>,
Thomas Gleixner <tglx@...utronix.de>,
Frederic Weisbecker <fweisbec@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Jones <davej@...hat.com>, Don Zickus <dzickus@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: frequent lockups in 3.18rc4
On Thu, Nov 20, 2014 at 06:39:20PM -0500, Tejun Heo wrote:
> On Thu, Nov 20, 2014 at 03:08:03PM -0800, Andy Lutomirski wrote:
> > > So, for now, all we need is adding nmi check in percpu accessors,
> > > right?
> > >
> >
> > What's the issue with nmi? Page faults are supposed to nest correctly
> > inside nmi, right?
>
> Thought they couldn't. Looking at the trace that Frederic linked, it
> looks like straight-out tracing function recursion due to an
> unexpected fault while holding a lock. I don't think this can be
> annotated from percpu accessor side. There's nothing special about
> the context. :(
There use to be issues with page faults in NMI. One was that the iretq
from the page fault handler would re-enable NMIs, and if another NMI triggered
then it would stomp all over the stack of the initial NMI. But my tripple
copy of the NMI stack frame solved that. You can read all about it here:
http://lwn.net/Articles/484932/
The second bug was that if an NMI triggered right after a page fault, and
it had a page fault, the content of the cr2 register (faulting address)
would be lost for the page fault that was preempted by the NMI.
This too was solved by using (queue irony) using per_cpu variables.
Now I'm hoping that kernel boot time per_cpu variables never take any
faults, otherwise we are all f*cked!
>
> Does this matter for anybody other than tracers? Ultimately, the
> solution would be removing the vmalloc area faulting as Thomas
> suggested.
I don't know, but per_cpu variables are rather special and used all
over the place. Most other vmalloc code isn't as used as per_cpu is.
-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists