Message-ID: <CALCETrU=WsfXcuvpf=U+_Uju6haHw0NsRFiL_rO2-M+uWaWHZA@mail.gmail.com>
Date: Thu, 23 Apr 2015 02:07:19 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Brian Gerst <brgerst@...il.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Oleg Nesterov <oleg@...hat.com>,
Ingo Molnar <mingo@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...en8.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Will Drewry <wad@...omium.org>,
Frédéric Weisbecker <fweisbec@...il.com>,
Alexei Starovoitov <ast@...mgrid.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Denys Vlasenko <dvlasenk@...hat.com>,
Kees Cook <keescook@...omium.org>,
Thomas Gleixner <tglx@...utronix.de>,
"linux-tip-commits@...r.kernel.org"
<linux-tip-commits@...r.kernel.org>
Subject: Re: [tip:x86/vdso] x86/vdso32/syscall.S: Do not load __USER32_DS to %ss

On Thu, Apr 23, 2015 at 1:49 AM, Andy Lutomirski <luto@...capital.net> wrote:
>
> I'm curious whether we can somehow end up in the kernel without a
> sensible SS. What happens if we have SS = 0?
>
> Try this on for size:
>
> 1. Wine process does syscall
> 2. Context switch to any other task
> 3. Interrupt (software or hardware), which loads SS with ss0, which is
> 0 on x86_64.
> 4. Context switch back to Wine.
> 5. sysretl
>
> Would fixing this be as simple as changing this code in
> arch/x86/kernel/process.c:
>
> __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = {
> .x86_tss = {
> .sp0 = TOP_OF_INIT_STACK,
> #ifdef CONFIG_X86_32
> .ss0 = __KERNEL_DS,
>
> by moving the ifdef down a line? Even if that fixed it, it would be
> extremely fragile, but IMO it would be a good change to make
> regardless (i.e. the kernel's SS would be less unpredictable).
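
For anyone following along, the five quoted steps can be written down as a toy state model. This is only a sketch, not kernel code: the struct, the function names and the selector values are all made up for illustration, and it assumes that a context switch leaves SS alone and that sysretl rewrites the selector without refreshing the cached descriptor attributes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative selector values, not taken from kernel headers. */
#define TOY_KERNEL_DS 0x18u
#define TOY_USER32_DS 0x2bu

/* Toy view of SS: the visible selector plus whether the hidden
 * (cached) descriptor attributes describe a usable segment. */
struct toy_ss {
	uint16_t selector;
	bool attrs_usable;
};

/* Step 1: syscall entry leaves the kernel with a sensible SS. */
static void toy_syscall_entry(struct toy_ss *ss)
{
	ss->selector = TOY_KERNEL_DS;
	ss->attrs_usable = true;
}

/* Steps 2 and 4. ASSUMPTION: a context switch does not touch SS. */
static void toy_context_switch(struct toy_ss *ss)
{
	(void)ss;
}

/* Step 3: an interrupt taken while the other task runs loads a null SS. */
static void toy_interrupt(struct toy_ss *ss)
{
	ss->selector = 0;
	ss->attrs_usable = false;
}

/* Step 5. ASSUMPTION: sysretl sets the selector only and keeps
 * whatever cached attributes were already there. */
static void toy_sysretl(struct toy_ss *ss)
{
	ss->selector = TOY_USER32_DS;
}

/* Runs steps 1-5 in order and returns the SS state user code sees. */
struct toy_ss toy_run_scenario(void)
{
	struct toy_ss ss = { 0, false };

	toy_syscall_entry(&ss);
	toy_context_switch(&ss);
	toy_interrupt(&ss);
	assert(ss.selector == 0);	/* in the kernel with SS == 0 */
	toy_context_switch(&ss);
	toy_sysretl(&ss);
	return ss;
}
```

Under those assumptions the model ends with a plausible-looking user selector whose cached attributes still come from the null SS.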

Confirmed with KVM on VMX: we can definitely end up in the kernel with SS == 0.
I don't know whether changing ss0 would be a good idea, though. It
would be cleaner, but it could slow down interrupt processing:
interrupt delivery would have to do an extra GDT load.

Food for thought: wouldn't this mean that we have a bug on sysretq
too? If we're in the kernel with SS == 0, we do sysretq, and then
user code does a far jump to 32-bit code, then we end up with a bogus
SS. Maybe we don't care, and reloading SS on every sysretq would
suck. We could fix it up in a kind of evil way: in do_stack_segment,
we could detect that we had SS == __USER_DS, in which case #SS should
be impossible, and just return without signalling the process. IRET
would fix up the attributes.
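
Roughly, the do_stack_segment hack would amount to a check like the one below. This is a standalone sketch, not a patch: toy_regs, toy_do_stack_segment and the selector value are hypothetical stand-ins rather than the real pt_regs or __USER_DS, and the "return" outcome only represents getting back to user mode through IRET.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for __USER_DS; the real value comes from
 * the kernel's segment definitions. */
#define TOY_USER_DS 0x2bu

/* Hypothetical, cut-down stand-in for the saved register frame. */
struct toy_regs {
	uint16_t ss;	/* SS selector saved when #SS was raised */
};

enum toy_action {
	TOY_RETURN_VIA_IRET,	/* swallow the fault; IRET reloads SS */
	TOY_DELIVER_SIGNAL	/* normal path: signal the process */
};

/* With SS == __USER_DS a #SS "should be impossible", so treat it as
 * the stale-attribute case and return without signalling.  Anything
 * else is handled as a genuine stack-segment fault. */
enum toy_action toy_do_stack_segment(const struct toy_regs *regs)
{
	assert(regs != NULL);

	if (regs->ss == TOY_USER_DS)
		return TOY_RETURN_VIA_IRET;

	return TOY_DELIVER_SIGNAL;
}
```

The sketch keys only on the saved selector, which is what makes the approach "kind of evil": it cannot tell a stale-attribute fault from any other #SS that happens to arrive with that selector.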

We just might need a stable fix, though -- I wonder if there's any bad
interaction with opportunistic sysret in 4.0. Maybe we should
benchmark ss0 = __KERNEL_DS and try it after all.

--Andy