linux-kernel - Re: [PATCH 05/19] x86/dumpstack: fix function graph tracing stack dump reliability issues


Message-ID: <20160802191622.466f53d2@gandalf.local.home>
Date:	Tue, 2 Aug 2016 19:16:22 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Josh Poimboeuf <jpoimboe@...hat.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	"H . Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	linux-kernel@...r.kernel.org,
	Andy Lutomirski <luto@...capital.net>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Brian Gerst <brgerst@...il.com>,
	Kees Cook <keescook@...omium.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Byungchul Park <byungchul.park@....com>
Subject: Re: [PATCH 05/19] x86/dumpstack: fix function graph tracing stack
 dump reliability issues

On Tue, 2 Aug 2016 17:13:59 -0500
Josh Poimboeuf <jpoimboe@...hat.com> wrote:

> > Then we only need the fp use case when FRAME_POINTER is not set. As
> > mcount forces FRAME_POINTER, we only need to worry about the fentry
> > case.  
> 
> Hm, I'm confused.  First, I don't see where mcount forces FRAME_POINTER.

Hmm, we should probably force it generally, as gcc itself requires
mcount to be used with frame pointers. -pg (which emits the mcount
calls) can't be used without them.

> 
> Second, I don't see why that even matters.  If mcount and frame pointers
> are enabled, then the 'fp' field of ftrace_ret_stack is needed for the
> gcc sanity check, right?  So we couldn't override 'fp', and the old
> "stateful index" version of ftrace_graph_ret_addr() would have to be
> used in the code above for reliable addresses, and we'd still have the
> same out-of-sync bug.
> 
> Or am I missing something?
> 

Or I missed something. How did we get out of sync? If we have frame
pointers, shouldn't the "return_to_handler" be seen as reliable by the
code (not that we save it as such)? That is, if the frame pointer shows
that the next function is return_to_handler, then we increment the
index into ret_stack, otherwise we simply record the return_to_handler
as a normal "unreliable" function, without any processing of it.

I guess I don't actually understand how the NMI screwed it up, as
function graph doesn't trace "do_nmi()" itself nor anything before that.
I'm guessing it really got out of sync because there's a
"return_to_handler" on the stack that wasn't really called (it isn't
reached through a frame pointer). ftrace_graph_ret_addr() currently
shifts the index regardless of whether the return_to_handler it finds
is part of a stack frame or just left over on the stack. THAT is why I
think it got out of sync.
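
To make the difference concrete, here is a minimal user-space sketch of
the two behaviours. It is not the kernel code: resolve_addrs(), demo(),
struct stack_word, the on_fp_chain flag and all the addresses are
hypothetical stand-ins, and ret_stack is modelled as a plain array with
the newest entry last.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the address of return_to_handler. */
#define RETURN_TO_HANDLER 0xffffUL

/* One word found while scanning the stack (simplified model). */
struct stack_word {
	unsigned long addr;
	int on_fp_chain;	/* nonzero: real return address reached via frame pointers */
};

/*
 * Walk the scanned words from newest to oldest and write the resolved
 * addresses to out[]. ret_stack[depth - 1] is the most recently saved
 * original return address. With check_fp == 0 every return_to_handler
 * consumes a ret_stack entry; with check_fp != 0 only those on the
 * frame pointer chain do, and leftovers are passed through untouched.
 */
static void resolve_addrs(const struct stack_word *words, size_t n,
			  const unsigned long *ret_stack, size_t depth,
			  int check_fp, unsigned long *out)
{
	size_t used = 0;	/* ret_stack entries consumed so far */

	for (size_t i = 0; i < n; i++) {
		out[i] = words[i].addr;
		if (words[i].addr != RETURN_TO_HANDLER)
			continue;
		if (check_fp && !words[i].on_fp_chain)
			continue;	/* leftover value: report as-is, unreliable */
		if (used < depth) {
			out[i] = ret_stack[depth - 1 - used];
			used++;
		}
	}
}

/* Returns 1 if both real traced frames resolve to the right address. */
static int demo(int check_fp)
{
	/* A stale return_to_handler sits above two real traced frames. */
	const struct stack_word words[] = {
		{ RETURN_TO_HANDLER, 0 },	/* leftover, never called */
		{ RETURN_TO_HANDLER, 1 },	/* real frame, newest */
		{ 0x3000UL, 1 },		/* untraced frame */
		{ RETURN_TO_HANDLER, 1 },	/* real frame, older */
	};
	const unsigned long ret_stack[] = { 0x1000UL, 0x2000UL };
	unsigned long out[4];

	resolve_addrs(words, 4, ret_stack, 2, check_fp, out);
	return out[1] == 0x2000UL && out[3] == 0x1000UL;
}
```

In this model demo(1) keeps the index in sync, while demo(0) lets the
leftover entry consume the newest ret_stack slot, so every real frame
after it resolves to the wrong address.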

-- Steve