Date:	Thu, 23 Jul 2015 14:48:13 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Andy Lutomirski <luto@...capital.net>, X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Willy Tarreau <w@....eu>, Borislav Petkov <bp@...en8.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Brian Gerst <brgerst@...il.com>
Subject: Re: Dealing with the NMI mess

On Thu, Jul 23, 2015 at 2:31 PM, Steven Rostedt <rostedt@...dmis.org> wrote:
>
> Let me get this straight. The idea is in the #DB handler to detect that
> it was triggered in NMI context, and if so, simply disarm that
> breakpoint permanently, right?

No, for simplicity, I'd make it cover not just NMI code, but any
"kernel code with interrupts disabled".

Because that's the test we'd use for "use ret instead of iret".

And that wider test is exactly because it's so damn hard to get the
exact instruction boundaries right. Let's *not* go down the path
(again) of having to get the whole %rip range and "magic stack pointer
values" etc.

Make it simple and completely unambiguous. The rule really would be:

 - if we return to kernel space and interrupts are disabled, we will
use "ret" rather than "iret"

   Hard rule. Simple. Straightforward. No random %rip values. No
random %rsp values. NO CRAP.

 - but because we use "ret" rather than "iret" we can't get RF
semantics, which means that #DB is special. RF is supposed to
guarantee that we make forward progress

   So for that reason, #DB just says "if the breakpoint happened
during that interrupts-off region, I will clear %dr7 to guarantee
forward progress"

So those would be the two main rules. Very simple, and avoiding all nasty cases.
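
The two rules above can be sketched as a small user-space C model. This is only an illustration, not kernel entry code: the struct `saved_regs`, the enum `exit_insn`, and the functions `choose_exit()` and `db_handler_dr7()` are hypothetical names invented for this sketch. The only hardware facts assumed are that EFLAGS.IF is bit 9 and that the low two bits of the saved CS give the privilege level being returned to.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_IF 0x200u   /* interrupt-enable flag, bit 9 of EFLAGS */

/* Hypothetical, simplified snapshot of the interrupted context. */
struct saved_regs {
    uint64_t flags;            /* saved EFLAGS/RFLAGS */
    uint16_t cs;               /* saved code segment selector */
};

enum exit_insn { EXIT_IRET, EXIT_RET };

/* Low two bits of CS are the privilege level; 0 means kernel. */
static bool returning_to_kernel(const struct saved_regs *r)
{
    return (r->cs & 3) == 0;
}

static bool irqs_were_off(const struct saved_regs *r)
{
    return (r->flags & X86_EFLAGS_IF) == 0;
}

/* Rule 1: kernel space with interrupts disabled returns with "ret". */
static enum exit_insn choose_exit(const struct saved_regs *r)
{
    if (returning_to_kernel(r) && irqs_were_off(r))
        return EXIT_RET;
    return EXIT_IRET;
}

/*
 * Rule 2: "ret" cannot provide RF semantics, so a #DB taken in that
 * same region clears %dr7 to guarantee forward progress.
 * Returns the value %dr7 should hold after the handler runs.
 */
static uint64_t db_handler_dr7(const struct saved_regs *r, uint64_t dr7)
{
    if (returning_to_kernel(r) && irqs_were_off(r))
        return 0;              /* disarm all breakpoints */
    return dr7;                /* iret path keeps RF, leave %dr7 alone */
}
```

Note that both rules share one predicate, with no %rip or %rsp range checks anywhere, which is the point of the proposal.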

Now, I'd be willing to then hide the "oops, we clear dr7 very
aggressively" issue by having a few additional _heuristics_. But I
call them "heuristics" because unlike the current NMI nesting games,
they aren't about core stability. They are about "ok, maybe somebody
wants to trigger those faults, and we'll be _nice_ and try to make it
easy for them", but nothing more.

So for example, if that "#DB clears %dr7" happened, it sounds easy to
set _TIF_USER_WORK_MASK, and just force %dr7 to be re-loaded from a
cached value, so that if we disabled things because of some user stack
trace access, it will be re-enabled by the time we return to user
space. I think that sounds reasonable, but it's not something the core
low-level entry x86 assembly code needs to even care about. It's not
that level of "core", it's just being polite.
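
A rough model of that reload heuristic, again as user-space C and not actual kernel code. Everything named here is hypothetical: `struct task_model`, the `WORK_RELOAD_DR7` bit (standing in for whatever flag would be covered by _TIF_USER_WORK_MASK), and both functions. The sketch assumes a per-task cached copy of %dr7 exists to restore from.

```c
#include <stdint.h>

#define WORK_RELOAD_DR7 (1u << 0)  /* hypothetical "user work" bit */

/* Hypothetical per-task state for the model. */
struct task_model {
    uint64_t dr7_cached;       /* value the task asked for */
    uint64_t dr7_hw;           /* value currently in the register */
    unsigned int work_flags;   /* pending work before return to user */
};

/* #DB hit in an interrupts-off region: disarm and request a reload. */
static void db_in_irqs_off_region(struct task_model *t)
{
    t->dr7_hw = 0;
    t->work_flags |= WORK_RELOAD_DR7;
}

/* Slow path before returning to user space: undo the disarm. */
static void prepare_return_to_user(struct task_model *t)
{
    if (t->work_flags & WORK_RELOAD_DR7) {
        t->dr7_hw = t->dr7_cached;
        t->work_flags &= ~WORK_RELOAD_DR7;
    }
}
```

The low-level exit path only ever sees "work pending or not"; the %dr7 restore lives entirely in the slow path, which matches the idea that this is politeness and not core stability.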

                 Linus