Message-ID: <CALCETrXb_A4s=ORYDEv4j1--tQsqKeHkyaKbL6cUhDa1FxpG6A@mail.gmail.com>
Date: Thu, 9 Jul 2015 18:33:59 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Andy Lutomirski <luto@...nel.org>
Cc: X86 ML <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Frédéric Weisbecker <fweisbec@...il.com>,
Rik van Riel <riel@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
Denys Vlasenko <vda.linux@...glemail.com>,
Borislav Petkov <bp@...en8.de>,
Kees Cook <keescook@...omium.org>,
Brian Gerst <brgerst@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC/PATCH 5/7] x86/vm86: Teach handle_vm86_trap to return to 32bit mode directly

On Thu, Jul 9, 2015 at 3:41 PM, Andy Lutomirski <luto@...capital.net> wrote:
> On Wed, Jul 8, 2015 at 12:24 PM, Andy Lutomirski <luto@...nel.org> wrote:
>> The TIF_NOTIFY_RESUME hack it was using was buggy and unsupportable.
>> vm86 mode was completely broken under ptrace, for example, because
>> we'd never make it to v8086 mode.
>>
>> This code is still a huge, scary mess, but at least it's no longer
>> tangled with the exit-to-userspace loop.
>
> This patch is incorrect. Brian, what's the ETA for your vm86 cleanup?
> If it's very soon, then I'll see if I can rely on it. If not, I'll
> have to come up with a way to fix this patch.
>
> Grr. The kernel state when handle_vm86_trap is called is absurd right
> now. Somehow we're supposed to survive do_trap, send a signal
> corresponding to the outside-vm86 state, and exit vm86 cleanly (with
> ax = 0), all before returning to user mode. I doubt these semantics
> are even intentional.
>
> This code sucks.

OK, I have a version that seems to work. It comes with a much better
selftest, too. I'll send it shortly.

Brian, would it make sense to base your work on top of it?

Now that I've looked at this stuff, if I were designing Linux support
for v8086 mode, I'd do it very differently. There wouldn't be a vm86
syscall at all. Instead you'd call sigaltstack, then raise a signal,
set X86_EFLAGS_VM, and return.

The kernel would handle X86_EFLAGS_VM being set by setting TIF_V8086
and adjusting sp0. On entry, TIF_V8086 would move the segment
registers from the hardware frame into pt_regs and, on exit, TIF_V8086
would move them back. Clearing X86_EFLAGS_VM (via ptrace, signal
delivery, or sigreturn) would sanitize the segment registers.
SYSENTER would be safe, so the SYSENTER_CS hack wouldn't be needed.
Of course, we'd lose the CPU state, so the user would have to be
careful.

And that's it. There wouldn't be any emulation -- user code could
emulate syscalls all by itself in a signal handler. Exiting v8086
mode would be straightforward -- just do anything that would raise a
signal.
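
The bookkeeping described above can be sketched as a toy user-space
model. This is not kernel code and not a proposed patch: every struct,
function, and MODEL_* name below is hypothetical, the value used for
sanitized selectors is an assumed placeholder, and the sp0 adjustment
is not modeled at all. Only X86_EFLAGS_VM (bit 17 of EFLAGS) and the
fact that the CPU saves ES, DS, FS, and GS in the extended frame when
it traps out of v8086 mode are real.

```c
/* Toy user-space model of the TIF_V8086 idea; not kernel code.
 * All struct, function, and MODEL_* names are hypothetical. */
#include <stdbool.h>
#include <stdint.h>

#define X86_EFLAGS_VM  (1u << 17)  /* EFLAGS.VM, bit 17 */
#define MODEL_SAFE_SEL 0x7bu       /* assumed placeholder for a sane selector */

/* Data segment values the CPU saves in the extended frame when it
 * traps out of v8086 mode. */
struct model_segs { uint16_t es, ds, fs, gs; };

struct model_task {
    bool tif_v8086;              /* models the proposed TIF_V8086 flag */
    uint32_t flags;              /* saved user EFLAGS */
    struct model_segs hw_frame;  /* segment slots in the hardware frame */
    struct model_segs pt_regs;   /* segment slots the rest of the kernel reads */
};

/* Kernel entry: under TIF_V8086, copy the segments from the hardware
 * frame into pt_regs. */
static void model_entry(struct model_task *t)
{
    if (t->tif_v8086)
        t->pt_regs = t->hw_frame;
}

/* Kernel exit: under TIF_V8086, copy them back so the return to user
 * mode restores the v8086 segments. */
static void model_exit(struct model_task *t)
{
    if (t->tif_v8086)
        t->hw_frame = t->pt_regs;
}

/* Every path that rewrites the saved flags (ptrace, signal delivery,
 * sigreturn) is assumed to funnel through this one function. */
static void model_set_flags(struct model_task *t, uint32_t new_flags)
{
    bool was_vm = (t->flags & X86_EFLAGS_VM) != 0;
    bool now_vm = (new_flags & X86_EFLAGS_VM) != 0;

    t->flags = new_flags;
    t->tif_v8086 = now_vm;
    if (was_vm && !now_vm) {
        /* Leaving v8086 mode: real-mode style segment values are not
         * valid protected-mode selectors, so sanitize them. */
        t->pt_regs.es = t->pt_regs.ds = MODEL_SAFE_SEL;
        t->pt_regs.fs = t->pt_regs.gs = MODEL_SAFE_SEL;
    }
}
```

In this model, a signal handler that sets X86_EFLAGS_VM in its saved
flags corresponds to model_set_flags(&t, X86_EFLAGS_VM), and each later
trap is a model_entry()/model_exit() pair. Keeping the flag in sync in
one place is what would let the trap and exit paths stay free of
vm86-specific special cases.
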
Of course, this isn't at all ABI-compatible with the current turd, and
v8086 mode isn't really that useful, so this is just idle retroactive
speculation. But the TIF_V8086 trick would still be useful to let us
get rid of all the awful hacks in the trap and exit code.

--Andy