Message-ID: <553E0A52.7070400@redhat.com>
Date: Mon, 27 Apr 2015 12:07:14 +0200
From: Denys Vlasenko <dvlasenk@...hat.com>
To: Borislav Petkov <bp@...en8.de>,
Andy Lutomirski <luto@...capital.net>
CC: Andy Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Denys Vlasenko <vda.linux@...glemail.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Brian Gerst <brgerst@...il.com>,
Ingo Molnar <mingo@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Oleg Nesterov <oleg@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Alexei Starovoitov <ast@...mgrid.com>,
Will Drewry <wad@...omium.org>,
Kees Cook <keescook@...omium.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue

On 04/27/2015 10:53 AM, Borislav Petkov wrote:
> On Sun, Apr 26, 2015 at 04:39:38PM -0700, Andy Lutomirski wrote:
>>> +#define X86_BUG_CANONICAL_RCX X86_BUG(8) /* SYSRET #GPs when %RCX non-canonical */
>>
>> I think that "sysret" should appear in the name.
>
> Yeah, I thought about it too, will fix.
>
>> Oh no! My laptop is currently bug-free, and you're breaking it! :)
>
> Muahahahhahaha...
>
>>> +
>>> + /*
>>> + * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
>>> + * in kernel space. This essentially lets the user take over
>>> + * the kernel, since userspace controls RSP.
>>> + */
>>> + ALTERNATIVE "jmp 1f", "", X86_BUG_CANONICAL_RCX
>>> +
>>
>> I know it would be ugly, but would it be worth saving two bytes by
>> using ALTERNATIVE "jmp 1f", "shl ...", ...?
>>
>>> /* Change top 16 bits to be the sign-extension of 47th bit */
>>> shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
>>> sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
>>> @@ -432,6 +436,7 @@ syscall_return:
>>> cmpq %rcx, %r11
>>> jne opportunistic_sysret_failed
>
> You want to stick all 4 insns in the alternative? Yeah, it should work
> but it might be even more unreadable than it is now.
>
> Btw, we can do this too:
>
> ALTERNATIVE "",
> "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
> sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
> cmpq %rcx, %r11 \
> jne opportunistic_sysret_failed"
> X86_BUG_SYSRET_CANONICAL_RCX
>
> which will replace the 2-byte JMP with a lot of NOPs on AMD.

The instructions you want to NOP out are translated to these bytes:

     2c2:   48 c1 e1 10    shl    $0x10,%rcx
     2c6:   48 c1 f9 10    sar    $0x10,%rcx
     2ca:   49 39 cb       cmp    %rcx,%r11
     2cd:   75 5f          jne    32e <opportunistic_sysret_failed>
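
For anyone reading along without the surrounding code: the shl/sar pair sign-extends bit 47 of %rcx into the top 16 bits, and the cmp/jne then bails out if that changed the value. Below is a rough C model of that check. It is only a sketch: it assumes __VIRTUAL_MASK_SHIFT is 47 (which is what the $0x10 shift count above implies), and the names sign_extend_bit47() and addr_is_canonical() are made up for illustration, they are not kernel functions.

```c
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, i.e. __VIRTUAL_MASK_SHIFT == 47,
 * matching the $0x10 (= 64 - 48) shift count in the disassembly. */
#define VIRTUAL_MASK_SHIFT 47
#define LOW_MASK ((UINT64_C(1) << (VIRTUAL_MASK_SHIFT + 1)) - 1)

/* Hypothetical helper modelling "shl $16,%rcx; sar $16,%rcx":
 * copy bit 47 into bits 48..63. Written with masks instead of a signed
 * right shift so the result does not depend on the compiler. */
static uint64_t sign_extend_bit47(uint64_t addr)
{
    uint64_t low = addr & LOW_MASK;

    if (addr & (UINT64_C(1) << VIRTUAL_MASK_SHIFT))
        low |= ~LOW_MASK;   /* bit 47 set: fill the top 16 bits with ones */
    return low;             /* bit 47 clear: top 16 bits stay zero */
}

/* Hypothetical helper modelling "cmp %rcx,%r11; jne ...":
 * the address is canonical iff sign extension leaves it unchanged. */
static int addr_is_canonical(uint64_t addr)
{
    return sign_extend_bit47(addr) == addr;
}
```

With this model, 0x00007fffffffffff and 0xffff800000000000 pass the check, while 0x0000800000000000 does not, since sign extension turns it into 0xffff800000000000.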

According to http://instlatx64.atw.hu/, CPUs from both AMD and Intel
are happy to eat "66,66,66,90" NOPs with maximum throughput; more than
three 66 prefixes slow decode down, sometimes horrifically (from 3 insns
per cycle to one insn per ~10 cycles).

Probably doing something like this

        /* Only three 0x66 prefixes for NOP for fast decode on all CPUs */
        ALTERNATIVE ".byte 0x66,0x66,0x66,0x90 \
                     .byte 0x66,0x66,0x66,0x90 \
                     .byte 0x66,0x66,0x66,0x90",
                    "shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
                     sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx \
                     cmpq %rcx, %r11 \
                     jne opportunistic_sysret_failed"
                    X86_BUG_SYSRET_CANONICAL_RCX

would be better than letting ALTERNATIVE generate 13 one-byte NOPs.
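
A quick byte count, as a sanity check on the sizes involved. This is just arithmetic on the encodings listed in the disassembly, written as a small C sketch; the array and function names are made up for illustration. Three 4-byte NOPs cover 12 of the 13 bytes, so one byte is left over. I am assuming ALTERNATIVE would pad that last byte itself (a single plain 0x90 is enough), but I have not checked what the macro actually emits in that case.

```c
#include <stddef.h>

/* Encodings copied from the disassembly of the four check instructions. */
static const unsigned char check_insns[] = {
    0x48, 0xc1, 0xe1, 0x10,   /* shl $0x10,%rcx */
    0x48, 0xc1, 0xf9, 0x10,   /* sar $0x10,%rcx */
    0x49, 0x39, 0xcb,         /* cmp %rcx,%r11 */
    0x75, 0x5f,               /* jne opportunistic_sysret_failed */
};

/* One NOP with three 0x66 prefixes, as proposed above. */
static const unsigned char nop4[] = { 0x66, 0x66, 0x66, 0x90 };

/* Number of such NOPs used in the proposed ALTERNATIVE. */
#define NOP4_COUNT 3

/* Hypothetical helper: bytes of the replacement not covered by the NOPs. */
static size_t uncovered_bytes(void)
{
    return sizeof(check_insns) - NOP4_COUNT * sizeof(nop4);
}
```

Here sizeof(check_insns) is 13 and uncovered_bytes() is 1.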
--
vda