lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue,  1 Sep 2015 15:41:06 -0700
From:	Andy Lutomirski <luto@...nel.org>
To:	x86@...nel.org, linux-kernel@...r.kernel.org
Cc:	Brian Gerst <brgerst@...il.com>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Borislav Petkov <bp@...en8.de>,
	Andy Lutomirski <luto@...nel.org>
Subject: [RFC 06/30] x86/sched/64: Don't save flags on context switch (reinstated)

This reinstates 2c7577a75837 ("sched/x86_64: Don't save flags on
context switch"), which was reverted in 512255a2ad2c.

Historically, Linux has always saved and restored EFLAGS across
context switches.  As far as I know, the only reason to do this is
because of the NT flag.  In particular, if something calls switch_to
with the NT flag set, then we don't want to leak the NT flag into a
different task that might try to IRET and fail because NT is set.

Before 8c7aa698baca ("x86_64, entry: Filter RFLAGS.NT on entry from
userspace"), we could run system call bodies with NT set.  This
would be a DoS or possibly privilege escalation hole if scheduling
in such a system call would leak NT into a different task.

Importantly, we don't need to worry about NT being set while
preemptible or across page faults.  The only way we can schedule due
to preemption or a page fault is in an interrupt entry that nests
inside the SYSENTER prologue.  The CPU will clear NT when entering
through an interrupt gate, so we won't schedule with NT set.

The only other interesting flags are IOPL and AC.  Allowing
switch_to to change IOPL has no effect, as the value loaded during
kernel execution doesn't matter at all except between a SYSENTER
entry and the subsequent PUSHF, and anythign that interrupts in that
window will restore IOPL on return.

If we call __switch_to with AC set, we have bigger problems.

Signed-off-by: Andy Lutomirski <luto@...nel.org>
---
 arch/x86/include/asm/switch_to.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index d7f3b3b78ac3..751bf4b7bf11 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -79,12 +79,12 @@ do {									\
 #else /* CONFIG_X86_32 */
 
 /* frame pointer must be last for get_wchan */
-#define SAVE_CONTEXT    "pushf ; pushq %%rbp ; movq %%rsi,%%rbp\n\t"
-#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp ; popf\t"
+#define SAVE_CONTEXT    "pushq %%rbp ; movq %%rsi,%%rbp\n\t"
+#define RESTORE_CONTEXT "movq %%rbp,%%rsi ; popq %%rbp\t"
 
 #define __EXTRA_CLOBBER  \
 	, "rcx", "rbx", "rdx", "r8", "r9", "r10", "r11", \
-	  "r12", "r13", "r14", "r15"
+	  "r12", "r13", "r14", "r15", "flags"
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 #define __switch_canary							  \
@@ -100,7 +100,11 @@ do {									\
 #define __switch_canary_iparam
 #endif	/* CC_STACKPROTECTOR */
 
-/* Save restore flags to clear handle leaking NT */
+/*
+ * There is no need to save or restore flags, because flags are always
+ * clean in kernel mode, with the possible exception of IOPL.  Kernel IOPL
+ * has no effect.
+ */
 #define switch_to(prev, next, last) \
 	asm volatile(SAVE_CONTEXT					  \
 	     "movq %%rsp,%P[threadrsp](%[prev])\n\t" /* save RSP */	  \
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ