lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 20 Jun 2016 16:39:53 -0700
From:	Andy Lutomirski <luto@...nel.org>
To:	x86@...nel.org, linux-kernel@...r.kernel.org
Cc:	Borislav Petkov <bp@...en8.de>, Andy Lutomirski <luto@...nel.org>,
	Pedro Alves <palves@...hat.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Kees Cook <keescook@...omium.org>
Subject: [PATCH v3 2/3] x86/signal: Rewire the restart_block() syscall to have a constant nr

Suppose a 64-bit task A traces a 32-bit task B.  B makes a syscall
that uses ERESTART_RESTARTBLOCK and gets a signal.  A catches
syscall exit, snapshots B's regs, changes the regs, and resumes.
Then A restores the snapshot of B's regs.

A expects B to resume where it left off when snapshotted, which means
that B should execute sys_restart_block().  We have a big set of hacks
in place that make this work in some cases, but in this particular
case, the hacks fall apart.  When A restores regs, because A is
64-bit, ptrace() will *not* set TS_I386_REGS_POKED.  But, because B
isn't stopped in a syscall, TS_COMPAT won't be set either.  As a
result, the current code will set nr=219 (the 64-bit
__NR_restart_syscall) and then promptly invoke madvise() (the 32-bit
syscall with nr==219).

This patch fixes the bug and simplifies the code simultaneous by
simply wiring up nr==380 to refer to sys_restart_block() in all cases.

Cc: Pedro Alves <palves@...hat.com>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Kees Cook <keescook@...omium.org>
Signed-off-by: Andy Lutomirski <luto@...nel.org>
---
 arch/x86/entry/syscalls/syscall_32.tbl |  2 ++
 arch/x86/entry/syscalls/syscall_64.tbl |  3 +++
 arch/x86/kernel/signal.c               | 34 ++++++++--------------------------
 3 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 4cddd17153fb..9fc9272d0c41 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -386,3 +386,5 @@
 377	i386	copy_file_range		sys_copy_file_range
 378	i386	preadv2			sys_preadv2			compat_sys_preadv2
 379	i386	pwritev2		sys_pwritev2			compat_sys_pwritev2
+# Same as restart_syscall but with the same nr for all ABIs.
+380	i386	restart_syscall2	sys_restart_syscall
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 555263e385c9..f074952eaad8 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -336,6 +336,9 @@
 327	64	preadv2			sys_preadv2
 328	64	pwritev2		sys_pwritev2
 
+# Same as restart_syscall but with the same nr for all ABIs.
+380	common	restart_syscall2	sys_restart_syscall
+
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
 # for native 64-bit operation.
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 6b952e1d8db8..b9f5133867f3 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -761,35 +761,17 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 static inline unsigned long get_nr_restart_syscall(const struct pt_regs *regs)
 {
 	/*
-	 * This function is fundamentally broken as currently
-	 * implemented.
-	 *
-	 * The idea is that we want to trigger a call to the
-	 * restart_block() syscall and that we want in_ia32_syscall(),
-	 * in_x32_syscall(), etc.  to match whatever they were in the
-	 * syscall being restarted.  We assume that the syscall
-	 * instruction at (regs->ip - 2) matches whatever syscall
-	 * instruction we used to enter in the first place.
-	 *
-	 * The problem is that we can get here when ptrace pokes
-	 * syscall-like values into regs even if we're not in a syscall
-	 * at all.
-	 *
-	 * For now, we maintain historical behavior and guess based on
-	 * stored state.  We could do better by saving the actual
-	 * syscall arch in restart_block or (with caveats on x32) by
-	 * checking if regs->ip points to 'int $0x80'.  The current
-	 * behavior is incorrect if a tracer has a different bitness
-	 * than the tracee.
+	 * We can't reliably distinguish between restarting an i386
+	 * syscall and an x86_64 syscall without decoding the
+	 * instruction at regs->ip -= 2.  Rather than trying, restart
+	 * using __NR_restart_syscall2, which works regardless of ABI.
+	 * We still want to preserve the x32 state, but we can do that
+	 * directly.
 	 */
-#ifdef CONFIG_IA32_EMULATION
-	if (current_thread_info()->status & (TS_COMPAT|TS_I386_REGS_POKED))
-		return __NR_ia32_restart_syscall;
-#endif
 #ifdef CONFIG_X86_X32_ABI
-	return __NR_restart_syscall | (regs->orig_ax & __X32_SYSCALL_BIT);
+	return __NR_restart_syscall2 | (regs->orig_ax & __X32_SYSCALL_BIT);
 #else
-	return __NR_restart_syscall;
+	return __NR_restart_syscall2;
 #endif
 }
 
-- 
2.5.5

Powered by blists - more mailing lists