[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5538D389.4090007@redhat.com>
Date: Thu, 23 Apr 2015 13:12:09 +0200
From: Denys Vlasenko <dvlasenk@...hat.com>
To: Brian Gerst <brgerst@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Oleg Nesterov <oleg@...hat.com>,
Ingo Molnar <mingo@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...en8.de>,
Andy Lutomirski <luto@...capital.net>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Will Drewry <wad@...omium.org>,
Frédéric Weisbecker <fweisbec@...il.com>,
Alexei Starovoitov <ast@...mgrid.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Kees Cook <keescook@...omium.org>,
Thomas Gleixner <tglx@...utronix.de>
CC: linux-tip-commits@...r.kernel.org
Subject: Re: [tip:x86/vdso] x86/vdso32/syscall.S: Do not load __USER32_DS
to %ss
On 04/23/2015 09:37 AM, Brian Gerst wrote:
> This patch unfortunately is causing Wine to break on some applications:
>
> Unhandled exception: stack overflow in 32-bit code (0xf779bc07).
> Register dump:
> CS:0023 SS:002b DS:002b ES:002b FS:0063 GS:006b
> EIP:f779bc07 ESP:00aed60c EBP:00aed750 EFLAGS:00010216( R- -- I -A-P- )
> EAX:00000040 EBX:00000010 ECX:00aed750 EDX:00000040
> ESI:00000040 EDI:7ffd4000
> Stack dump:
> 0x00aed60c: 00aed648 f7575e5b 7bcc8000 00000000
> 0x00aed61c: 7bc7bc09 00000010 00aed750 00000040
> 0x00aed62c: 00aed750 00aed650 7bcc8000 7bc7bbdd
> 0x00aed63c: 7bcc8000 00aed6a0 00aed750 00aed738
> 0x00aed64c: 7bc7cfa9 00000011 00aed750 00000040
> 0x00aed65c: 00000020 00000000 00000000 7bc4f141
> Backtrace:
> =>0 0xf779bc07 __kernel_vsyscall+0x7() in [vdso].so (0x00aed750)
> 1 0xf7575e5b __libc_read+0x4a() in libpthread.so.0 (0x00aed648)
> 2 0x7bc7bc09 read_reply_data+0x38(buffer=0xaed750, size=0x40)
> [/home/bgerst/src/wine/wine32/dlls/ntdll/../../../dlls/ntdll/server.c:239]
> in ntdll (0x00aed648)
> 3 0x7bc7cfa9 wine_server_call+0x178() in ntdll (0x00aed738)
> 4 0x7bc840ec NtSetEvent+0x4b(handle=0x80,
> NumberOfThreadsReleased=0x0(nil))
> [/home/bgerst/src/wine/wine32/dlls/ntdll/../../../dlls/ntdll/sync.c:361]
> in ntdll (0x00aed7c8)
> 5 0x7b874afa SetEvent+0x24(handle=<couldn't compute location>)
> [/home/bgerst/src/wine/wine32/dlls/kernel32/../../../dlls/kernel32/sync.c:572]
> in kernel32 (0x00aed7e8)
> 6 0x0044e31a in battle.net launcher (+0x4e319) (0x00aed818)
> ...
>
> __kernel_vsyscall+0x7 points to "pop %ebp".
>
> This is on an AMD Phenom(tm) II X6 1055T Processor.
>
> It appears that there are some subtle differences in how sysretl works
> on AMD vs. Intel. According to the Intel docs, the SS selector and
> descriptor cache is completely reset by sysret to fixed values. The
> AMD docs however are concerning:
>
> AMD's syscall:
> SS.sel = MSR_STAR.SYSCALL_CS + 8
> SS.attr = 64-bit stack,dpl0
> SS.base = 0x00000000
> SS.limit = 0xFFFFFFFF
>
> AMD's sysret:
> SS.sel = MSR_STAR.SYSRET_CS + 8 // SS selector is changed,
> // SS base, limit, attributes unchanged.
>
> Not changing base or limit is no big deal, but not changing attributes
> could be the problem. It might be leaving the "64-bit stack"
> attribute set, for whatever that means.
>
> Reloading SS from the GDT would obviously reset any bad state left by
> sysretl. Unfortunately we may have to put it back in, and then NOP it
> out on Intel.
Please try attached tentative fix:
>From 4b9e9bdf08b092724934e62892116af6dd712128 Mon Sep 17 00:00:00 2001
From: Denys Vlasenko <dvlasenk@...hat.com>
Date: Thu, 23 Apr 2015 12:38:47 +0200
Subject: [PATCH] x86/asm/entry/32: Restore %ss before SYSRETL if necessary
AMD docs say that SYSRET32 loads %ss selector with a value from a MSR,
but *cached descriptor* of %ss is not modified.
It was observed to cause Wine crashes. Conjectured sequence of events
causing it is:
1. Wine process does syscall.
2. Context switch to any other task.
3. Interrupt (software or hardware), which loads %ss with 0.
(This happens according to both Intel and AMD docs.)
%ss cached descriptor is set to "invalid" state.
4. Context switch back to Wine.
5. sysret to 32-bit. %ss has correct value but its cached descriptor
is still invalid.
6. The very first userspace POP insn after this causes exception 12.
Fix this by checking %ss value. If it is not __KERNEL_DS,
(and it really can only be __KERNEL_DS or zero),
then load it with __KERNEL_DS.
Signed-off-by: Denys Vlasenko <dvlasenk@...hat.com>
---
arch/x86/ia32/ia32entry.S | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 0c302d0..94c0b39 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -198,6 +198,20 @@ sysexit_from_sys_call:
* with 'sysenter' and it uses the SYSENTER calling convention.
*/
andl $~TS_COMPAT,ASM_THREAD_INFO(TI_status, %rsp, SIZEOF_PTREGS)
+ /*
+ * On AMD, SYSRET32 loads %ss selector, but does not modify its
+ * cached descriptor; and in kernel, %ss can be loaded with 0,
+ * setting cached descriptor to "invalid". This has no effect on
+ * 64-bit mode, but on return to 32-bit mode, it makes stack ops fail.
+ * Fix %ss only if it's wrong: read from %ss takes ~2 cycles,
+ * write to %ss is ~40 cycles.
+ */
+ movl %ss, %ecx
+ cmpl $__KERNEL_DS, %ecx
+ je 1f
+ movl $__KERNEL_DS, %ecx
+ movl %ecx, %ss
+1:
movl RIP(%rsp),%ecx /* User %eip */
CFI_REGISTER rip,rcx
RESTORE_RSI_RDI
@@ -408,6 +421,20 @@ cstar_dispatch:
sysretl_from_sys_call:
andl $~TS_COMPAT, ASM_THREAD_INFO(TI_status, %rsp, SIZEOF_PTREGS)
RESTORE_RSI_RDI_RDX
+ /*
+ * On AMD, SYSRET32 loads %ss selector, but does not modify its
+ * cached descriptor; and in kernel, %ss can be loaded with 0,
+ * setting cached descriptor to "invalid". This has no effect on
+ * 64-bit mode, but on return to 32-bit mode, it makes stack ops fail.
+ * Fix %ss only if it's wrong: read from %ss takes ~2 cycles,
+ * write to %ss is ~40 cycles.
+ */
+ movl %ss, %ecx
+ cmpl $__KERNEL_DS, %ecx
+ je 1f
+ movl $__KERNEL_DS, %ecx
+ movl %ecx, %ss
+1:
movl RIP(%rsp),%ecx
CFI_REGISTER rip,rcx
movl EFLAGS(%rsp),%r11d
--
1.8.1.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists