lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 29 Mar 2015 14:16:05 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Denys Vlasenko <vda.linux@...glemail.com>
Cc:	Andy Lutomirski <luto@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Denys Vlasenko <dvlasenk@...hat.com>,
	Borislav Petkov <bp@...en8.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	X86 ML <x86@...nel.org>, Ingo Molnar <mingo@...nel.org>,
	"H. Peter Anvin" <hpa@...or.com>, stable <stable@...r.kernel.org>
Subject: Re: [RFC] x86, ia32entry: Use sysretl to return from sysenter

On Sun, Mar 29, 2015 at 12:07 PM, Denys Vlasenko
<vda.linux@...glemail.com> wrote:
> On Fri, Mar 27, 2015 at 10:54 PM, Andy Lutomirski <luto@...nel.org> wrote:
>> --- a/arch/x86/ia32/ia32entry.S
>> +++ b/arch/x86/ia32/ia32entry.S
>> @@ -180,28 +180,34 @@ sysenter_dispatch:
>>         testl   $_TIF_ALLWORK_MASK, ASM_THREAD_INFO(TI_flags, %rsp, SIZEOF_PTREGS)
>>         jnz     sysexit_audit
>>  sysexit_from_sys_call:
>> +       /*
>> +        * NB: sysexit is not obviously safe for 64-bit kernels -- an
>> +        * NMI between sti and sysexit has poorly specified behavior,
>> +        * and and NMI followed by an IRQ with usergs is fatal.  So
>> +        * we just pretend we're using sysexit but we really use
>> +        * sysretl instead.
>> +        *
>> +        * This code path is still called sysexit because it pairs
>> +        * with sysenter and it uses the sysenter calling convention.
>> +        */
>>         andl    $~TS_COMPAT,ASM_THREAD_INFO(TI_status, %rsp, SIZEOF_PTREGS)
>> -       /* clear IF, that popfq doesn't enable interrupts early */
>> -       andl    $~0x200,EFLAGS(%rsp)
>> -       movl    RIP(%rsp),%edx          /* User %eip */
>> -       CFI_REGISTER rip,rdx
>> +       movl    RIP(%rsp),%ecx          /* User %eip */
>> +       CFI_REGISTER rip,rcx
>>         RESTORE_RSI_RDI
>
> I think you need to replace
> RESTORE_RSI_RDI with RESTORE_RSI_RDI_RDX

I didn't because I clear it...

>
>> -       /* pop everything except ss,rsp,rflags slots */
>> -       REMOVE_PT_GPREGS_FROM_STACK 3*8
>> +       xorl    %edx,%edx
>
> Why do you clear %edx?
>

...and I clear it because the post-sysexit trampoline in the vdso pops edx.

Eventually I'd like to clean that stuff up too, but in the mean time
IMO it would be nice to keep this change self-contained and avoid
fiddling with calling conventions.  There's a long history of nasty
bugs in there.

See, for example, this long thread:

http://lkml.iu.edu/hypermail/linux/kernel/1108.2/02158.html

>
>>         xorq    %r8,%r8
>>         xorq    %r9,%r9
>>         xorq    %r10,%r10
>> -       xorq    %r11,%r11
>> -       popfq_cfi
>> +       movl    EFLAGS(%rsp),%r11d      /* User eflags */
>>         /*CFI_RESTORE rflags*/
>> -       popq_cfi %rcx                           /* User %esp */
>> -       CFI_REGISTER rsp,rcx
>>         TRACE_IRQS_ON
>>         /*
>> -        * 32bit SYSEXIT restores eip from edx, esp from ecx.
>> -        * cs and ss are loaded from MSRs.
>> +        * Sysretl works even on Intel CPUs.  Use it in preference to sysexit,
>> +        * since it avoids a dicey window with interrupts enabled.
>> +        * CS and SS are loaded from MSRs.
>
> Please do not remove the mini-doc which says what is restored from where.
> These instructions are not that obvious. I propose:
>
>          * 64bit->32bit SYSRET restores eip from ecx,
>          * eflags from r11 (but RF and VM bits are forced to 0),
>          * CS and SS are loaded from MSRs. CS and SS are loaded from MSRs.
>

I'll do something like that in v2.

>
>
> Or maybe just
>
>         ...
> sysexit_from_sys_call:
>         /*
>          * Sysretl works even on Intel CPUs.  Use it in preference to sysexit,
>          * since it avoids a dicey window with interrupts enabled.
>          */
>         jmp to sysretl_from_sys_call
>
>
>
> and remove the entire "sysexit tail" code path?

I'd rather eventually do the big cleanup I mentioned in the other thread.

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ