Message-ID: <1337959746.13348.264.camel@gandalf.stny.rr.com>
Date: Fri, 25 May 2012 11:29:06 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: "H. Peter Anvin" <hpa@...ux.intel.com>
Cc: Dave Jones <davej@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...hat.com>, Andi Kleen <ak@...ux.intel.com>
Subject: Re: BUG - function tracing with breakpoints (was: Re: tracing
ring_buffer_resize oops.)
On Fri, 2012-05-25 at 10:31 -0400, Steven Rostedt wrote:
> Looks like we set RSP to code. Again pointing to a corrupted iretq.
> Maybe we are having nested debug stack usage, where we are hitting a
> breakpoint before setting the idt to not change the stack?
Another clue. If I do not trace the following functions:
func_ptr_is_kernel_text
kprobe_exceptions_notify
hw_breakpoint_exceptions_notify
notifier_call_chain*
it works fine.
# echo func_ptr_is_kernel_text kprobe_exceptions_notify \
hw_breakpoint_exceptions_notify notifier_call_chain* > set_ftrace_notrace
# echo function > current_tracer
works!
These notifiers are being called by the breakpoint. So perhaps the
breakpoint is still being called by int3 when it shouldn't be. It
shouldn't because we have:
dotraplinkage void __kprobes notrace do_int3(struct pt_regs *regs, long error_code)
{
#ifdef CONFIG_DYNAMIC_FTRACE
	/* ftrace must be first, everything else may cause a recursive crash */
	if (unlikely(modifying_ftrace_code) && ftrace_int3_handler(regs))
		return;
#endif
The fix I added (which hasn't fixed it completely) was:
 void arch_ftrace_update_code(int command)
 {
 	modifying_ftrace_code++;
+	/*
+	 * Make sure that all CPUs see this before we start
+	 * adding breakpoints.
+	 */
+	smp_mb();
 	ftrace_modify_all_code(command);
+	/* Finish all breakpoints before clearing */
+	smp_mb();
+
 	modifying_ftrace_code--;
 }
This would make sense for this bug, as if modifying_ftrace_code was not
seen by other CPUs, it wouldn't go into the ftrace_int3_handler() path.
That would cause this issue. But the bug remains after the smp_mb()'s
were put in place. Although it behaves a little differently now. Maybe
there's something else I missed?
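For anyone following along, the ordering that patch is aiming for can be walked through in user space. This is only a rough sketch, not kernel code: C11 atomics and atomic_thread_fence() stand in for smp_mb(), and modifying_code, breakpoint_armed, handle_trap(), modify_all_code() and update_code() are hypothetical names made up for the illustration. It runs on a single thread, so it shows the intended sequence of events and nothing about real cross-CPU visibility.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are the kernel's definitions. */
static atomic_int modifying_code;     /* models modifying_ftrace_code */
static atomic_bool breakpoint_armed;  /* models an int3 planted at a traced call site */

enum trap_path { PATH_FTRACE, PATH_NOTIFIERS };

/* Result of the trap taken while the update was in flight. */
static enum trap_path path_during_update;

/* Models the top of do_int3(): the ftrace check runs before anything else. */
static enum trap_path handle_trap(void)
{
	if (atomic_load(&modifying_code) && atomic_load(&breakpoint_armed))
		return PATH_FTRACE;
	/* Otherwise the notifier chain would run, and it may itself be traced. */
	return PATH_NOTIFIERS;
}

/* Models ftrace_modify_all_code(): plant a breakpoint, take a trap on it, remove it. */
static void modify_all_code(void)
{
	/* The flag has to be up before any breakpoint exists. */
	assert(atomic_load(&modifying_code) > 0);

	atomic_store(&breakpoint_armed, true);
	path_during_update = handle_trap();
	atomic_store(&breakpoint_armed, false);
}

/* Models the patched arch_ftrace_update_code(). */
static void update_code(void)
{
	atomic_fetch_add_explicit(&modifying_code, 1, memory_order_relaxed);
	/* Stand-in for the first smp_mb(): publish the flag before adding breakpoints. */
	atomic_thread_fence(memory_order_seq_cst);

	modify_all_code();

	/* Stand-in for the second smp_mb(): breakpoints are gone before the flag drops. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_sub_explicit(&modifying_code, 1, memory_order_relaxed);
}
```

Calling update_code() once and then reading path_during_update shows which path the in-flight trap took. A trap taken outside the update window falls through to the notifier path, which is the path that recurses when those notifier functions are themselves being traced.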
-- Steve