Message-ID: <1322600301.17003.84.camel@frodo>
Date: Tue, 29 Nov 2011 15:58:21 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andi Kleen <andi@...stfloor.org>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"H. Peter Anvin" <hpa@...ux.intel.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
Paul Turner <pjt@...gle.com>
Subject: Re: Perhaps a side effect regarding NMI returns
On Tue, 2011-11-29 at 12:36 -0800, Linus Torvalds wrote:
> On Tue, Nov 29, 2011 at 12:31 PM, Andi Kleen <andi@...stfloor.org> wrote:
> >
> > As a simple fix your proposal of forcing IRET sounds good.
>
> We could of course use iret to return to the regular kernel stack, and
> do the schedule from there.
>
> So instead of doing the manual stack switch, just build a fake iret
> stack on our exception stack. Subtle and somewhat complicated. I'd
> almost rather just do a blind iret, and leave the 'iret to regular
> stack' as a possible future option.
Note, the reason I've been looking at this code is that I'm working on
implementing your idea for handling irets inside NMIs (caused by faults
and exceptions), and for the case I really care about: debugging.
Your proposal is here:
https://lkml.org/lkml/2010/7/14/264
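
For reference, here's a rough C-level sketch of how I read that
proposal. This is only an illustration: the real thing has to be done
in the entry assembly, and the names nmi_in_progress / nmi_pending and
the do_real_nmi_work() helper are made up. The first NMI sets a
per-CPU flag; a nested NMI (possible because a fault's iret re-enabled
NMIs) just records that it happened, and the outer NMI repeats itself
before the final return.

#include <linux/percpu.h>

static DEFINE_PER_CPU(int, nmi_in_progress);
static DEFINE_PER_CPU(int, nmi_pending);

static void do_real_nmi_work(void)
{
	/* placeholder for the actual NMI handler body */
}

void sketch_nmi_handler(void)
{
	if (this_cpu_read(nmi_in_progress)) {
		/*
		 * We nested: a fault inside the first NMI did an iret,
		 * which re-enabled NMIs.  Just note it and bail.
		 */
		this_cpu_write(nmi_pending, 1);
		return;
	}

	this_cpu_write(nmi_in_progress, 1);
	do {
		this_cpu_write(nmi_pending, 0);
		do_real_nmi_work();
	} while (this_cpu_read(nmi_pending));
	this_cpu_write(nmi_in_progress, 0);
}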
But to make this work, it would be really nice if the NMI routine
weren't entangled with the paranoid_exit code.
For things like static_branch()/jump_label, and for modifying ftrace
nops to calls and back, we currently use the big-hammer approach of
stop_machine(). That keeps other CPUs from executing code while it is
being modified. There are also tricks to handle NMIs that may be
running on the stopped CPUs.
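
As a rough illustration of that approach (not the actual ftrace code;
the struct and helper names are made up, and it assumes the kernel text
is writable, where the real code goes through text_poke() or flips the
page protections first), the patching happens inside a stop_machine()
callback so every other CPU is parked and cannot execute the bytes
being rewritten:

#include <linux/stop_machine.h>
#include <linux/string.h>

struct patch_req {
	void		*addr;		/* instruction being rewritten */
	const void	*opcode;	/* new bytes */
	size_t		len;
};

/* Runs with every other CPU spinning in stop_machine(), so nothing can
 * execute 'addr' while the bytes are half rewritten. */
static int do_patch(void *data)
{
	struct patch_req *req = data;

	memcpy(req->addr, req->opcode, req->len);
	return 0;
}

static void patch_text_stop_machine(void *addr, const void *opcode, size_t len)
{
	struct patch_req req = { .addr = addr, .opcode = opcode, .len = len };

	/* NULL cpumask: run do_patch() on one CPU and park the rest. */
	stop_machine(do_patch, &req, NULL);
}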
But people don't like the overhead that stop_machine() causes, and I
have code that can make the ftrace modifications with break points
instead. Adding a break point, syncing, modifying the rest of the code,
and then replacing the break point with the new op greatly reduces the
overhead. At least the latency is much lower.
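
Roughly, the sequence looks like this (a sketch only, and x86
specific; sync_all_cpus()/do_sync_core() and patch_insn_with_breakpoint()
are made-up helpers, and the real code also has to teach the int3
handler to skip over the patched instruction while the break point is
in place):

#include <linux/smp.h>
#include <asm/alternative.h>	/* text_poke() */
#include <asm/processor.h>	/* sync_core() */

#define INT3_OPCODE	0xcc

static void do_sync_core(void *unused)
{
	sync_core();
}

static void sync_all_cpus(void)
{
	/* IPI every CPU and wait for them to serialize. */
	on_each_cpu(do_sync_core, NULL, 1);
}

/* Replace one 'len'-byte instruction at 'addr' with 'new_insn' without
 * stopping the machine.  While the int3 is in place, the trap handler
 * must treat a hit at 'addr' as a nop and skip the instruction. */
static void patch_insn_with_breakpoint(void *addr, const void *new_insn, size_t len)
{
	unsigned char int3 = INT3_OPCODE;

	text_poke(addr, &int3, 1);		/* 1. break point the first byte */
	sync_all_cpus();

	text_poke(addr + 1, new_insn + 1, len - 1);	/* 2. rewrite the tail */
	sync_all_cpus();

	text_poke(addr, new_insn, 1);		/* 3. drop in the new first byte */
	sync_all_cpus();
}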
The problem is that ftrace affects code in NMIs. We tried not to trace
NMIs, but there are so many functions that NMIs call that it ended up
being a losing battle. But if we can fix the "NMIs re-enabled by iret"
issue, we can use the break point scheme for both static_branch() and
ftrace, and remove the overhead of stop_machine(). With this fix, I
think there's also a possibility of using kprobes in NMIs.
-- Steve