Date:	Mon, 14 Nov 2011 20:31:22 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Gleb Natapov <gleb@...hat.com>
Cc:	fweisbec@...il.com, mingo@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: Oops while doing "echo function_graph > current_tracer"

On Mon, 2011-11-14 at 16:07 +0200, Gleb Natapov wrote:
> Hi Steven,
> 
> I get an oops with current linux.git when I am doing
> "echo function_graph > current_tracer" inside a kvm guest.
> Oopses do not contain much useful information and they are always
> different. Looks like stack corruption (at least this is what Oopses
> say when not triple faulting).
> 
> Attached is my guest kernel .config. I do not have the same problem on
> the host, but the kernel config is different there.


Looking into this, I see that this is an old bug. I guess this shows how
many people run function graph tracing from a guest, or at least how many
do so with DEBUG_PREEMPT enabled.

The problem is that kvm_clock_read() does a get_cpu_var(), which calls
preempt_disable(), which calls add_preempt_count(), which is itself
traced. But this path is outside the recursion protection in
function_graph tracing: when add_preempt_count() is traced, the tracer
reads the trace clock, which on your guest is kvm_clock_read(), which
calls add_preempt_count() again, which gets traced again, and so on,
recursing until the stack is corrupted.
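
Roughly, the chain looks like this; a simplified sketch of the code in
question, not the exact source:

	static cycle_t kvm_clock_read(void)
	{
		struct pvclock_vcpu_time_info *src;
		cycle_t ret;

		src = &get_cpu_var(hv_clock);	/* preempt_disable() */
		ret = pvclock_clocksource_read(src);
		put_cpu_var(hv_clock);		/* preempt_enable() */
		return ret;
	}

	/* With CONFIG_DEBUG_PREEMPT, preempt_disable() expands to: */
	#define preempt_disable() \
	do { \
		add_preempt_count(1);	/* out of line, so it gets traced */ \
		barrier(); \
	} while (0)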

There are a few fixes we can do. For now, because this is an old bug, I
would just tell you to do this first:

echo add_preempt_count sub_preempt_count > /sys/kernel/debug/tracing/set_ftrace_notrace

But that is just a workaround for you, not a complete fix.

I could just make add_preempt_count() notrace and be done with it, but
I've been reluctant to do that because there have been several times I've
actually wanted to see the add_preempt_count() calls being traced.
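
For reference, that would be a one-word change, something like this
(a sketch; the real function also does debug checks):

	/* kernel/sched.c */
	void notrace add_preempt_count(int val)
	{
		preempt_count() += val;	/* debug checks omitted */
	}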

I could also make a get_cpu_var_notrace() version that kvm_clock_read()
could use. This is the solution I would most likely want as the
permanent fix.
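
No such helper exists yet; a minimal sketch of what I have in mind,
modeled on get_cpu_var() but using the existing notrace preempt helpers:

	#define get_cpu_var_notrace(var) (*({		\
		preempt_disable_notrace();		\
		&__get_cpu_var(var); }))

	#define put_cpu_var_notrace(var) do {		\
		(void)&(var);				\
		preempt_enable_notrace();		\
	} while (0)

kvm_clock_read() would then use the notrace variants in place of
get_cpu_var()/put_cpu_var().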

Then finally, I could add recursion protection to the function_graph
tracer itself, so that when it recurses it just bails out cleanly. I
think I'll add that with a WARN_ON_ONCE(): without the warning, if a
recursion slips in silently, we'd pay the overhead of the recursion on
top of the overhead of the tracing, making it worse than it already is.
Function graph tracing is the most invasive tracer, and I want to speed
it up if possible (I already have ideas on doing so); I do not want to
make it slower.
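
Something along these lines; a sketch only, with made-up names, and the
real guard would also have to handle interrupt contexts:

	static DEFINE_PER_CPU(int, graph_recursion);

	/* at the top of the function_graph entry handler: */
	if (unlikely(__get_cpu_var(graph_recursion))) {
		WARN_ON_ONCE(1);
		return 0;	/* drop the event instead of recursing */
	}
	__get_cpu_var(graph_recursion)++;

	/* ... record the function entry ... */

	__get_cpu_var(graph_recursion)--;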

Thanks!

-- Steve


