Date:	Tue, 18 Sep 2012 18:29:35 -0700
From:	Salman Qazi <sqazi@...gle.com>
To:	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...ux.intel.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: [PATCH] x86: Don't clobber top of pt_regs in nested NMI

A nested NMI modifies the location (instruction pointer, flags and
stack) that the first NMI will iret to.  However, the copy of the
registers it modifies is exactly the one that forms part of pt_regs
in the first NMI, so a nested NMI can change the behaviour of the
first one.

In particular, Google's arch_trigger_all_cpu_backtrace handler
also prints regions of memory surrounding the addresses that appear
in registers.  Reading that memory can take exceptions; these are
handled, but once one has been taken, nested NMIs can start coming
in.  These nested NMIs change the register values in pt_regs, which
can cause the original NMI handler to produce incorrect output.

We solve this problem by introducing an extra copy of the iret
registers that is exclusively part of pt_regs and is not modified
elsewhere.  The downside is that the do_nmi function can no longer
change the control flow, as any values it writes to these five
registers will be discarded.

Signed-off-by: Salman Qazi <sqazi@...gle.com>
---
 arch/x86/kernel/entry_64.S |   20 +++++++++++++++++++-
 1 files changed, 19 insertions(+), 1 deletions(-)
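
For readers who want to see the effect of the five pushes without
reading the assembly, here is a rough user-space sketch in C.  It is
only a toy model under stated assumptions: struct toy_stack, the helper
names, the stack size and the register values are all made up for
illustration and do not correspond to the real kernel stack layout.
It models "pushq 4*8(%rsp)" as reading the source word relative to the
stack pointer before the decrement, repeats that five times, lets a
simulated nested NMI overwrite the older copy, and checks that the new
copy seen by the handler is unchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: five iret words are RIP, CS, RFLAGS, RSP, SS. */
enum { IRET_WORDS = 5, STACK_WORDS = 32 };

struct toy_stack {
    uint64_t mem[STACK_WORDS];
    int sp;                     /* index of top of stack; grows toward 0 */
};

/* Models `pushq 4*8(%rsp)`: source address uses sp before the decrement. */
static void push_from_sp_plus_4(struct toy_stack *s)
{
    uint64_t v = s->mem[s->sp + 4];
    s->sp -= 1;
    s->mem[s->sp] = v;
}

/* Models `.rept 5 ; pushq_cfi 4*8(%rsp) ; .endr`. */
static void copy_iret_frame(struct toy_stack *s)
{
    for (int i = 0; i < IRET_WORDS; i++)
        push_from_sp_plus_4(s);
}

/* Models `addq $5*8, %rsp` on the exit path. */
static void drop_private_copy(struct toy_stack *s)
{
    s->sp += IRET_WORDS;
}

/* Returns 1 if the handler's copy survives a simulated nested NMI. */
static int demo_nested_nmi(void)
{
    /* Hypothetical register values, chosen only for the demo. */
    const uint64_t frame[IRET_WORDS] = { 0x1000, 0x10, 0x246, 0x7000, 0x18 };
    struct toy_stack s;

    memset(&s, 0, sizeof s);
    s.sp = 16;
    memcpy(&s.mem[s.sp], frame, sizeof frame);

    int shared = s.sp;          /* copy a nested NMI may rewrite */
    copy_iret_frame(&s);
    int private_copy = s.sp;    /* copy handed to the handler */
    assert(private_copy == shared - IRET_WORDS);

    /* Simulated nested NMI rewrites the older copy. */
    s.mem[shared + 0] = 0xdead; /* RIP */
    s.mem[shared + 2] = 0;      /* RFLAGS */
    s.mem[shared + 3] = 0xbeef; /* RSP */

    int intact = memcmp(&s.mem[private_copy], frame, sizeof frame) == 0;

    drop_private_copy(&s);      /* iret would now use the rewritten copy */
    return intact && s.sp == shared && s.mem[s.sp] == 0xdead;
}
```

Calling demo_nested_nmi() from a small main() exercises the whole
sequence.  The same arithmetic explains the offset change in the
second hunk: with five more words on the stack, the "NMI executing"
variable moves from 10*8(%rsp) to 15*8(%rsp).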

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 69babd8..40ddb6d 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1724,6 +1724,18 @@ repeat_nmi:
 end_repeat_nmi:
 
 	/*
+	 * We want a running NMI handling routine to have a consistent
+	 * picture of register state.  This should hold true even if
+	 * there is a nested NMI.  Therefore, we let the nested NMI
+	 * play with the previous copy of these registers and leave this
+	 * new copy unmodified for do_nmi()
+	 */
+	.rept 5
+	pushq_cfi 4*8(%rsp)
+	.endr
+	CFI_DEF_CFA_OFFSET SS+8-RIP
+
+	/*
 	 * Everything below this point can be preempted by a nested
 	 * NMI if the first NMI took an exception and reset our iret stack
 	 * so that we repeat another NMI.
@@ -1771,7 +1783,13 @@ nmi_swapgs:
 nmi_restore:
 	RESTORE_ALL 8
 	/* Clear the NMI executing stack variable */
-	movq $0, 10*8(%rsp)
+	movq $0, 15*8(%rsp)
+
+	/* Pop the extra copy of iret context that was saved above
+	 * just for do_nmi()
+	 */
+	addq $5*8, %rsp
+
 	jmp irq_return
 	CFI_ENDPROC
 END(nmi)
