Date:	Fri, 22 Jul 2016 11:03:14 -0700
From:	Dave Hansen <dave@...1.net>
To:	linux-kernel@...r.kernel.org
Cc:	x86@...nel.org, luto@...nel.org, Dave Hansen <dave@...1.net>,
	dave.hansen@...ux.intel.com
Subject: [PATCH 3/3] x86, pkeys: allow instruction fetches in presence of pkeys


From: Dave Hansen <dave.hansen@...ux.intel.com>

Thanks to Andy Lutomirski for pointing out the potential issue
here.

Memory protection keys only affect data access.  They do not
affect instruction fetches.  So, an instruction may not be
readable, while it *is* executable.
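
To illustrate (not part of this patch): a minimal userspace
sketch, assuming pkeys-capable hardware plus the pkey_alloc() and
pkey_mprotect() syscalls from this series (the glibc wrappers
arrived later; raw syscalls work too), that builds a page you can
execute but not read:

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		long page = sysconf(_SC_PAGESIZE);
		unsigned char *code = mmap(NULL, page,
				PROT_READ | PROT_WRITE,
				MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
		int pkey;

		code[0] = 0xc3;	/* x86 RET */

		/* a key with all data access disabled: */
		pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS);
		if (pkey < 0)
			return 1;	/* no pkey support */
		pkey_mprotect(code, page, PROT_READ | PROT_EXEC, pkey);

		((void (*)(void))code)();	/* fetch: works */
		printf("executed\n");
		return code[0];	/* data read: SIGSEGV (pkey) */
	}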

The page fault code's prefetch check reads instructions directly
and might trip over an instruction made unreadable by pkeys.
Turn pkeys off temporarily so we can fetch the instruction.

===== Long explanation: =====

We have a long history of bugs with prefetch instructions faulting
when they should not.  So, in the slow paths of our page fault
code, we go peeking at the instructions being run when the fault
occurred to see if the fault could have been caused by a prefetch
instruction.
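
For reference, the peeking boils down to scanning the bytes at
the faulting instruction pointer for the two prefetch opcodes.
A simplified, standalone sketch of that check (the real code in
arch/x86/mm/fault.c also walks over instruction prefixes first):

	#include <stddef.h>

	/*
	 * Simplified sketch, not the kernel's exact code:
	 * 0x0F 0x0D is the 3DNow! PREFETCH/PREFETCHW group,
	 * 0x0F 0x18 the SSE PREFETCHh group.
	 */
	static int starts_with_prefetch(const unsigned char *insn,
					size_t len)
	{
		if (len < 2)
			return 0;
		return insn[0] == 0x0f &&
		       (insn[1] == 0x0d || insn[1] == 0x18);
	}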

To do the peeking, the kernel looks at the instruction pointer
and fetches the instruction, sometimes right out of userspace.
But, protection keys may get in the way and make the
kernel's data access fault.  The kernel will assume that the
instruction pointer was bogus and go angrily along on its way
to killing the app instead of _actually_ fetching a prefetch
instruction.

Does this matter?  Probably not.  The hardware implementing
protection keys is not even publicly available yet.  But, if it
or some future processor has a prefetch bug, this will help us
properly work around it.

This makes the instruction temporarily readable so that we can do
a data access and peek at its opcode.  The change is only visible
to the thread doing the fault: PKRU is per-thread state, saved and
restored along with the rest of the XSAVE state on context switch.
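
The same save/clear/restore pattern as a userspace sketch, using
the RDPKRU/WRPKRU intrinsics (assumes a -mpku toolchain); the
kernel code below uses read_pkru()/write_pkru() instead:

	#include <immintrin.h>

	/* read *addr even if its pkey has access disabled: */
	static unsigned char peek_byte(const unsigned char *addr)
	{
		unsigned int saved_pkru = _rdpkru_u32(); /* RDPKRU */
		unsigned char b;

		if (saved_pkru)
			_wrpkru(0);		/* open all 16 keys */
		b = *addr;
		if (saved_pkru)
			_wrpkru(saved_pkru);	/* put it back */
		return b;
	}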

I thought this might be painful to solve, requiring something
akin to a kernel_fpu_begin/end() pair.  After thinking about it,
I managed to convince myself that we do not need that harsh a
solution and this simple one is OK, mostly because we never do
lazy save/restore of PKRU: the register always holds the current
task's value, so writing it directly cannot clobber state that
belongs to some other context.

This is slow-ish.  The RDPKRU/WRPKRU/WRPKRU sequence probably
costs us dozens of cycles, plus the extra branches from the
if(OSPKE) checks.  It also pays that cost for each byte of the
instructions that it fetches, which is a bit naughty.  But this
is a slow path already, so I haven't tried to optimize it at all.

Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Andy Lutomirski <luto@...nel.org>
---

 b/arch/x86/mm/fault.c |   51 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)

diff -puN arch/x86/mm/fault.c~pkeys-900-fix-prefetch arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-900-fix-prefetch	2016-07-22 10:52:33.274364481 -0700
+++ b/arch/x86/mm/fault.c	2016-07-22 10:52:33.277364616 -0700
@@ -20,6 +20,7 @@
 #include <asm/pgalloc.h>		/* pgd_*(), ...			*/
 #include <asm/kmemcheck.h>		/* kmemcheck_*(), ...		*/
 #include <asm/fixmap.h>			/* VSYSCALL_ADDR		*/
+#include <asm/fpu/internal.h>		/* for use_eager_fpu() checks	*/
 #include <asm/vsyscall.h>		/* emulate_vsyscall		*/
 #include <asm/vm86.h>			/* struct vm86			*/
 #include <asm/mmu_context.h>		/* vma_pkey()			*/
@@ -76,6 +77,52 @@ static nokprobe_inline int kprobes_fault
 }
 
 /*
+ * Memory protection keys only affect data access.  They do not
+ * affect instruction fetches.  So, an instruction may not be
+ * readable, while it *is* executable.  This makes the
+ * instruction temporarily readable so that we can do a data
+ * access and peek at its opcode.
+ */
+static
+int probe_insn_opcode(void *insn_address, unsigned char *ret_opcode)
+{
+	int ret;
+	u32 saved_pkru = read_pkru();
+
+	/*
+	 * Clear out all of the access/write-disable bits in
+	 * PKRU.  This ensures that pkeys will not block access
+	 * to @insn_address.  If no keys are access-disabled
+	 * (saved_pkru==0), we avoid the cost of the PKRU writes
+	 * and the continued cost of having taken PKRU out of its
+	 * (XSAVE) init state.
+	 *
+	 * Note also that this only affects access to user
+	 * addresses.  Kernel (supervisor) mappings are not
+	 * affected by this register.
+	 */
+	if (saved_pkru)
+		write_pkru(0);
+	/*
+	 * We normally have to be very careful with FPU registers
+	 * and preempt.  But, PKRU is different.  It is never
+	 * lazily saved/restored, so we don't have to be as
+	 * careful with this as normal FPU state.  Enforce this
+	 * assumption with the WARN_ON().
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
+		WARN_ON_ONCE(!use_eager_fpu());
+	ret = probe_kernel_address(insn_address, *ret_opcode);
+	/*
+	 * Restore PKRU to what it was, if we changed it above.
+	 */
+	if (saved_pkru)
+		write_pkru(saved_pkru);
+
+	return ret;
+}
+
+/*
  * Prefetch quirks:
  *
  * 32-bit mode:
@@ -126,7 +173,7 @@ check_prefetch_opcode(struct pt_regs *re
 		return !instr_lo || (instr_lo>>1) == 1;
 	case 0x00:
 		/* Prefetch instruction is 0x0F0D or 0x0F18 */
-		if (probe_kernel_address(instr, opcode))
+		if (probe_insn_opcode(instr, &opcode))
 			return 0;
 
 		*prefetch = (instr_lo == 0xF) &&
@@ -160,7 +207,7 @@ is_prefetch(struct pt_regs *regs, unsign
 	while (instr < max_instr) {
 		unsigned char opcode;
 
-		if (probe_kernel_address(instr, opcode))
+		if (probe_insn_opcode(instr, &opcode))
 			break;
 
 		instr++;
_
