[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20210125074506.15064-1-jiangshanlai@gmail.com>
Date: Mon, 25 Jan 2021 15:45:06 +0800
From: Lai Jiangshan <jiangshanlai@...il.com>
To: linux-kernel@...r.kernel.org
Cc: Steven Rostedt <rostedt@...dmis.org>,
Peter Zijlstra <peterz@...radead.org>,
Lai Jiangshan <laijs@...ux.alibaba.com>,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>
Subject: [PATCH V2] x86/entry/64: De-Xen-ify our NMI code further
From: Lai Jiangshan <laijs@...ux.alibaba.com>
The commit 929bacec21478("x86/entry/64: De-Xen-ify our NMI code") simplified
the NMI code by changing paravirt code into native code and left a comment
about "inspecting RIP instead". But until now, "inspecting RIP instead"
has not been made happened and this patch tries to complete it.
Comments in the code was from Andy Lutomirski. Thanks!
Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
---
arch/x86/entry/entry_64.S | 44 ++++++++++-----------------------------
1 file changed, 11 insertions(+), 33 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index cad08703c4ad..21f67ea62341 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1268,32 +1268,14 @@ SYM_CODE_START(asm_exc_nmi)
je nested_nmi
/*
- * Now test if the previous stack was an NMI stack. This covers
- * the case where we interrupt an outer NMI after it clears
- * "NMI executing" but before IRET. We need to be careful, though:
- * there is one case in which RSP could point to the NMI stack
- * despite there being no NMI active: naughty userspace controls
- * RSP at the very beginning of the SYSCALL targets. We can
- * pull a fast one on naughty userspace, though: we program
- * SYSCALL to mask DF, so userspace cannot cause DF to be set
- * if it controls the kernel's RSP. We set DF before we clear
- * "NMI executing".
+ * Now test if we interrupted an outer NMI that just cleared "NMI
+ * executing" and is about to IRET. This is a single-instruction
+ * window. This check does not handle the case in which we get a
+ * nested interrupt (#MC, #VE, #VC, etc.) after clearing
+ * "NMI executing" but before the outer NMI executes IRET.
*/
- lea 6*8(%rsp), %rdx
- /* Compare the NMI stack (rdx) with the stack we came from (4*8(%rsp)) */
- cmpq %rdx, 4*8(%rsp)
- /* If the stack pointer is above the NMI stack, this is a normal NMI */
- ja first_nmi
-
- subq $EXCEPTION_STKSZ, %rdx
- cmpq %rdx, 4*8(%rsp)
- /* If it is below the NMI stack, it is a normal NMI */
- jb first_nmi
-
- /* Ah, it is within the NMI stack. */
-
- testb $(X86_EFLAGS_DF >> 8), (3*8 + 1)(%rsp)
- jz first_nmi /* RSP was user controlled. */
+ cmpq $.Lnmi_iret, 8(%rsp)
+ jne first_nmi
/* This is a nested NMI. */
@@ -1438,17 +1420,13 @@ nmi_restore:
addq $6*8, %rsp
/*
- * Clear "NMI executing". Set DF first so that we can easily
- * distinguish the remaining code between here and IRET from
- * the SYSCALL entry and exit paths.
- *
- * We arguably should just inspect RIP instead, but I (Andy) wrote
- * this code when I had the misapprehension that Xen PV supported
- * NMIs, and Xen PV would break that approach.
+ * Clear "NMI executing". This leaves a window in which a nested NMI
+ * could observe "NMI executing" cleared, and a nested NMI will detect
+ * this by inspecting RIP.
*/
- std
movq $0, 5*8(%rsp) /* clear "NMI executing" */
+.Lnmi_iret: /* must be immediately after clearing "NMI executing" */
/*
* iretq reads the "iret" frame and exits the NMI stack in a
* single instruction. We are returning to kernel mode, so this
--
2.19.1.6.gb485710b
Powered by blists - more mailing lists