linux-kernel - Re: Dealing with the NMI mess

Message-ID: <CALCETrVAzhE7w3BDjqRack54BLncZALbnAOZyeXHx1cSTryy4g@mail.gmail.com>
Date:	Thu, 23 Jul 2015 13:49:16 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Willy Tarreau <w@....eu>, Borislav Petkov <bp@...en8.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Brian Gerst <brgerst@...il.com>
Subject: Re: Dealing with the NMI mess

On Thu, Jul 23, 2015 at 1:38 PM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> On Thu, Jul 23, 2015 at 1:21 PM, Andy Lutomirski <luto@...capital.net> wrote:
>>
>> 2. Forbid IRET inside NMIs.  Doable but maybe not that pretty.
>>
>> We haven't considered:
>>
>> 3. Forbid faults (other than MCE) inside NMI.
>
> I'd really prefer #2. #3 depends on us getting many things right, and
> never introducing new cases in the future.
>
> #2, in contrast, seems to be fairly localized. Yes, RF is an issue,
> but returning to user space with RF clear doesn't really seem to be
> all that problematic.
>
> The point of RF is to make forward progress in the face of debug
> register faults, but I don't see what was wrong with the whole
> "disable any debug events that happen with interrupts disabled".
>
> And no, I do *not* believe that we should disable debug faults ahead
> of time. We should take them, disable them, and return with 'ret'. No
> complex "you can't put breakpoints in this region" crap, no magic
> rules, no subtle issues.
>
> I really think your "disallow #DB" is pointless. I think your "prevent
> instruction breakpoints in NMI" is wrong. Let them happen. Take them
> and disable them. Return with RF clear. Go on with your life.
>
> And the "take them and disable them" is really simple. No "am I in an
> NMI context" thing (because that leads to the whole question about
> "what is NMI context"). That's not the real rule anyway.
>
> No, make it very simple and straightforward. Make the test be "uhhuh,
> I got a #DB in kernel mode, and interrupts were disabled - I know I'm
> going to return with "ret", so I'm just going to have to disable this
> breakpoint".
>
> Nothing clever. Nothing subtle. Nothing that needs "this range of
> instructions is magical". No.  Just a very simple rule: if the context
> we return to is kernel mode and interrupts are disabled, we're using
> 'ret', so we cannot suppress debug faults.
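
For reference, the rule quoted above can be sketched roughly as follows.
This is only a user-space model under stated assumptions, not kernel code:
struct fake_regs, must_disable_breakpoint() and disable_triggered() are
hypothetical names invented for illustration. Only the bit positions
(EFLAGS.IF, the CS privilege bits, DR6 B0..B3, DR7 Ln/Gn) are
architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_IF (1u << 9)   /* EFLAGS interrupt-enable flag */

/* Hypothetical, simplified view of the frame saved on #DB entry. */
struct fake_regs {
    uint16_t cs;     /* code segment selector of the interrupted context */
    uint64_t flags;  /* EFLAGS of the interrupted context */
};

/* The low two bits of CS hold the privilege level; 3 means user mode. */
bool returns_to_user(const struct fake_regs *regs)
{
    return (regs->cs & 3) == 3;
}

/*
 * The quoted rule: returning to kernel mode with interrupts disabled
 * means the exit path uses RET, so RF cannot suppress a repeat fault
 * and the breakpoint has to be disabled instead.
 */
bool must_disable_breakpoint(const struct fake_regs *regs)
{
    return !returns_to_user(regs) && !(regs->flags & X86_EFLAGS_IF);
}

/*
 * DR6 bits 0..3 (B0..B3) report which breakpoint fired.  DR7 bits 2n
 * and 2n+1 (Ln and Gn) enable breakpoint n.  Clear the enable bits of
 * every breakpoint that DR6 says triggered and return the new DR7.
 */
uint64_t disable_triggered(uint64_t dr7, uint64_t dr6)
{
    for (int n = 0; n < 4; n++) {
        if (dr6 & (1ull << n))
            dr7 &= ~(3ull << (2 * n));
    }
    return dr7;
}
```

In this model a #DB handler would call must_disable_breakpoint() on the
saved frame and, if it returns true, write disable_triggered(dr7, dr6)
back to DR7 before returning.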

There are some subtleties in here.

Issue A: to return with RF clear, we need to disarm the breakpoint.
If it's limited to the duration of the NMI, that's easy.  If not, when
do we re-arm?  A new prepare_exit_to_usermode hook?  Hmm, setting
thread_info (ti) flags during context switch may target the wrong task.

Issue B: single-step exception after SYSENTER.  The patches I just
sent fix that, though.

Issue C: #DB with invalid stack pointer (can happen due to watchpoints
during SYSCALL entry or SYSRET exit).  I guess we need to ban such
watchpoints.

Issue D: debug exception inside EFI (especially mixed-mode EFI).  We
can't return using RET, so we need to catch that case.

These issues mostly go away if we preemptively disarm DR7 early in NMI
processing and rearm it at the end.
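
A minimal sketch of that disarm/rearm idea, assuming a simulated DR7 so
it can run in user space.  fake_dr7, read_dr7(), write_dr7(),
nmi_with_dr7_disarmed() and record_body() are hypothetical names for
illustration; real code would access %dr7 with privileged instructions
and would live in the NMI entry path.

```c
#include <stdint.h>

/* Stand-in for the hardware DR7 register (assumption: simulation only). */
static uint64_t fake_dr7;

uint64_t read_dr7(void)        { return fake_dr7; }
void write_dr7(uint64_t value) { fake_dr7 = value; }

/* DR7 bits 0..7 are the L0/G0..L3/G3 breakpoint enable bits. */
#define DR7_ENABLE_MASK 0xffull

typedef void (*nmi_body_fn)(void);

/*
 * Save DR7, clear every enable bit so no hardware breakpoint or
 * watchpoint can fire while the NMI body runs, then restore the saved
 * value once the body is done.
 */
void nmi_with_dr7_disarmed(nmi_body_fn body)
{
    uint64_t saved = read_dr7();

    write_dr7(saved & ~DR7_ENABLE_MASK);
    body();
    write_dr7(saved);
}

/* Example body: records what DR7 looks like while the NMI is running. */
uint64_t dr7_seen_in_body;

void record_body(void)
{
    dr7_seen_in_body = read_dr7();
}
```

With DR7 set to 0x403 beforehand, record_body() sees 0x400 while it runs
and DR7 is back to 0x403 afterwards.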

--Andy