Message-ID: <20141121170151.GC30603@home.goodmis.org>
Date: Fri, 21 Nov 2014 12:01:51 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Tejun Heo <tj@...nel.org>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Jones <davej@...hat.com>, Don Zickus <dzickus@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...capital.net>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: frequent lockups in 3.18rc4
On Fri, Nov 21, 2014 at 11:25:06AM -0500, Tejun Heo wrote:
>
> * Static percpu areas wouldn't trigger fault lazily. Note that this
> is not necessarily because the first percpu chunk which contains the
> static area is embedded inside the kernel linear mapping. Depending
> on the memory layout and boot param, percpu allocator may choose to
> map the first chunk in vmalloc space too; however, this still works
> out fine because at that point there are no other page tables and
> the PUD entries covering the first chunk are faulted in before other
> page tables are copied from the kernel one.
That sounds correct.
>
> * NMI used to be a problem because vmalloc fault handler couldn't
> safely nest inside NMI handler but this has been fixed since and it
> should work fine from NMI handlers now.
Right. Of course "should work fine" does not exactly mean "will work fine".
>
> * Function tracers are problematic because they may end up nesting
> inside themselves through triggering a vmalloc fault while accessing
> dynamic percpu memory area. This may lead to recursive locking and
> other surprises.
The function tracer infrastructure now has a recursive check that happens
rather early in the call. Unless the registered OPS specifically states
it handles recursions (FTRACE_OPS_FL_RECURSION_SAFE), ftrace will add the
necessary recursion checks. If a registered OPS lies about being recursion
safe, well, we can't stop suicide.
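To make the above concrete, here is a minimal userspace sketch of that kind of guard. It is not the ftrace code: toy_ops, TOY_OPS_FL_RECURSION_SAFE, toy_dispatch() and the single global flag are all made-up stand-ins (the real check is tracked per context, not in one global).

```c
/* Hypothetical stand-ins; none of these names are the real ftrace API. */
#define TOY_OPS_FL_RECURSION_SAFE 0x1u

struct toy_ops {
	void (*func)(struct toy_ops *ops, unsigned long ip);
	unsigned int flags;
	int calls;			/* how many times func actually ran */
};

/* Assumption: one global flag models the "already inside the tracer" state. */
static int toy_in_tracer;

/* Dispatch to a callback, guarding it unless it claims to be recursion safe. */
static void toy_dispatch(struct toy_ops *ops, unsigned long ip)
{
	if (ops->flags & TOY_OPS_FL_RECURSION_SAFE) {
		ops->func(ops, ip);	/* callback promised to do its own check */
		return;
	}

	if (toy_in_tracer)
		return;			/* nested entry: drop it */

	toy_in_tracer = 1;
	ops->func(ops, ip);
	toy_in_tracer = 0;
}

/* A callback that re-enters the tracer, as a traced fault path would. */
static void toy_reentrant_cb(struct toy_ops *ops, unsigned long ip)
{
	ops->calls++;
	if (ops->calls < 100)		/* hard stop so a missing guard cannot overflow the stack */
		toy_dispatch(ops, ip);
}
```

With flags == 0 the callback runs once and the nested call is dropped. With TOY_OPS_FL_RECURSION_SAFE set on the same callback (i.e. the OPS lies), nothing stops the nesting except the artificial cap of 100.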
Looking at kernel/trace/trace_functions.c: function_trace_call() which is
registered with RECURSION_SAFE, I see that the recursion check is done
before the per_cpu_ptr() call to the dynamically allocated per_cpu data.
It looks OK, but...
Oh! But if we trace the page fault handler and take a vmalloc fault here
too, the nested fault overwrites the cr2 register before the outer handler
has read it. Not good.
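A toy userspace model of the cr2 problem, in case it helps. Everything here is hypothetical: toy_cr2 stands in for the single cr2 register, toy_trace_hook() for a tracer callback touching not-yet-mapped dynamic percpu memory, and the addresses are arbitrary. The "early read" variant is only one ordering that avoids the clobber in this model, not a statement about what the kernel does.

```c
#define TOY_PERCPU_ADDR 0x2000UL	/* arbitrary address of the tracer's percpu data */

static unsigned long toy_cr2;		/* models the one cr2 register */
static int toy_percpu_mapped;		/* 0 until the first lazy fault maps it */

/* Nested vmalloc-style fault: the hardware overwrites cr2 on every fault. */
static void toy_vmalloc_fault(unsigned long addr)
{
	toy_cr2 = addr;
	toy_percpu_mapped = 1;		/* lazily sync the mapping */
}

/* Tracer hook that runs on entry to the traced fault handler. */
static void toy_trace_hook(void)
{
	if (!toy_percpu_mapped)
		toy_vmalloc_fault(TOY_PERCPU_ADDR);
}

/* Handler that reads cr2 only after the hook ran: gets the wrong address. */
static unsigned long toy_fault_late_read(unsigned long addr)
{
	toy_cr2 = addr;			/* the original fault sets cr2 */
	toy_trace_hook();		/* the nested fault clobbers it */
	return toy_cr2;
}

/* Handler that snapshots cr2 before anything traceable runs. */
static unsigned long toy_fault_early_read(unsigned long addr)
{
	unsigned long saved;

	toy_cr2 = addr;
	saved = toy_cr2;		/* read before the hook can fault */
	toy_trace_hook();
	return saved;
}
```

Starting with toy_percpu_mapped == 0, toy_fault_late_read(0x1000UL) returns TOY_PERCPU_ADDR rather than the address that actually faulted, while toy_fault_early_read(0x1000UL) returns 0x1000UL.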
-- Steve
>
> Are there other cases where the lazy vmalloc faults can break things?