linux-kernel - Re: Getting rid of invalid SYSCALL RSP under Xen?

Message-ID: <CALCETrUBgBLKk2kSm3KwfmBvwjmqE40NqMxZHF6gr8WSGxhuOw@mail.gmail.com>
Date:	Sun, 26 Jul 2015 15:08:06 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Andrew Cooper <andrew.cooper3@...rix.com>
Cc:	X86 ML <x86@...nel.org>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Borislav Petkov <bp@...en8.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
Subject: Re: Getting rid of invalid SYSCALL RSP under Xen?

On Sun, Jul 26, 2015 at 12:34 PM, Andrew Cooper
<andrew.cooper3@...rix.com> wrote:
> On 23/07/2015 17:49, Andy Lutomirski wrote:
>> Hi-
>
> Hi.  Apologies for the delay.  I have been out of the office for a few days.
>
>>
>> In entry_64.S, we have:
>>
>> ENTRY(entry_SYSCALL_64)
>>     /*
>>      * Interrupts are off on entry.
>>      * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
>>      * it is too small to ever cause noticeable irq latency.
>>      */
>>     SWAPGS_UNSAFE_STACK
>>     /*
>>      * A hypervisor implementation might want to use a label
>>      * after the swapgs, so that it can do the swapgs
>>      * for the guest and jump here on syscall.
>>      */
>> GLOBAL(entry_SYSCALL_64_after_swapgs)
>>
>>     movq    %rsp, PER_CPU_VAR(rsp_scratch)
>>     movq    PER_CPU_VAR(cpu_current_top_of_stack), %rsp
>>
>> It would be really, really nice if Xen entered the SYSCALL path
>> *after* the mov to %rsp.
>>
>> Similarly, we have:
>>
>>     movq    RSP(%rsp), %rsp
>>     /* big comment */
>>     USERGS_SYSRET64
>>
>> It would be really nice if we could just mov to %rsp, swapgs, and
>> sysret, without worrying that the sysret is actually a jump on Xen.
>>
>> I suspect that making Xen stop using these code paths would actually
>> be a simplification.  On SYSCALL entry, Xen lands in
>> xen_syscall_target (AFAICT) and clearly has rsp pointing somewhere
>> valid.  Xen obligingly shoves the user RSP into the hardware RSP
>> register and jumps into the entry code.
>>
>> Is that stuff running on the normal kernel stack?
>
> Yes. The Xen ABI takes what is essentially tss->esp0 and uses that stack
> for all "switch to kernel" actions, including syscall and sysenter.
>
>>   If so, can we just
>> enter later on:
>>
>>     pushq    %r11                /* pt_regs->flags */
>>     pushq    $__USER_CS            /* pt_regs->cs */
>>     pushq    %rcx                /* pt_regs->ip */
>>
>> <-- Xen enters here
>>
>>     pushq    %rax                /* pt_regs->orig_ax */
>>     pushq    %rdi                /* pt_regs->di */
>>     pushq    %rsi                /* pt_regs->si */
>>     pushq    %rdx                /* pt_regs->dx */
>
> This looks plausible, and indeed preferable to the current double step
> with undo_xen_syscall.
>
> One slight complication is that xen_enable_syscall() will want to
> special case register_callback() to not set CALLBACKF_mask_events, as
> the entry point is now after re-enabling interrupts.

I wouldn't do that.  Let's just move the ENABLE_INTERRUPTS a few
instructions later even on native -- I want to do that anyway.
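
As a rough illustration of the entry point proposed in the quoted text, the
push sequence can be modelled in C. This is only a sketch: struct frame_tail
and both helper functions are hypothetical names, and the struct covers just
the seven slots shown in the quoted excerpt, not the full pt_regs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the slots pushed in the quoted excerpt, lowest
 * address first. The stack grows down, so the last pushq lands at the
 * lowest address and the first pushq at the highest.
 */
struct frame_tail {
    uint64_t dx;       /* pushed last in the excerpt */
    uint64_t si;
    uint64_t di;
    uint64_t orig_ax;  /* first push after the proposed Xen entry point */
    uint64_t ip;       /* already on the stack when Xen would enter */
    uint64_t cs;
    uint64_t flags;    /* pushed first in the excerpt */
};

/* Slots that must already be populated at the proposed entry point. */
static size_t slots_before_xen_entry(void)
{
    return (sizeof(struct frame_tail) - offsetof(struct frame_tail, ip))
           / sizeof(uint64_t);
}

/* Slots the shared code still pushes after that point (in this excerpt). */
static size_t slots_after_xen_entry(void)
{
    return offsetof(struct frame_tail, ip) / sizeof(uint64_t);
}
```

Under this model, flags, cs and ip are in place before the shared code runs,
and orig_ax, di, si and dx are pushed afterwards regardless of which side
entered.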

>
>>
>> For SYSRET, I think the way to go is to force Xen to always use the
>> syscall slow path.  Instead, Xen could hook into
>> syscall_return_via_sysret or even right before the opportunistic
>> sysret stuff.  Then we could remove the USERGS_SYSRET hooks entirely.
>>
>> Would this work?
>
> None of the opportunistic sysret stuff makes sense under Xen.  The path
> will inevitably end up in xen_iret making a hypercall.  Short circuiting
> all of this seems like a good idea, especially if it allows for the
> removal of the USERGS_SYSRET.

Doesn't Xen decide what to do based on VGCF_IN_SYSCALL?  Maybe Xen
should have its own opportunistic VGCF_IN_SYSCALL logic.

Hmm, maybe some of this would be easier to think about if, rather than
having a paravirt op, we could have:

ALTERNATIVE "", "jmp xen_pop_things_and_iret", X86_FEATURE_XEN

Or just IF_XEN("jmp ...");

As a practical matter, x86_64 has native and Xen -- I don't think
there's any other paravirt platform that needs the asm hooks.
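
The control flow this would give can be sketched in C. This is a model only:
enum exit_path and pick_exit_path() are hypothetical, on_xen stands in for the
feature bit, and sysret_ok is a placeholder for the outcome of the
opportunistic sysret checks. A real ALTERNATIVE is patched in at boot, so
there would be no runtime branch as there is here.

```c
#include <stdbool.h>

/* Hypothetical labels for the ways out of the syscall return path. */
enum exit_path {
    EXIT_SYSRET,    /* native fast path */
    EXIT_IRET,      /* native slow path */
    EXIT_XEN_IRET   /* stand-in for "jmp xen_pop_things_and_iret" */
};

/*
 * on_xen models the patched-in jump: when set, control leaves before any
 * of the opportunistic sysret checks are consulted. On native, the
 * patched site is empty and the existing checks decide.
 */
static enum exit_path pick_exit_path(bool on_xen, bool sysret_ok)
{
    if (on_xen)
        return EXIT_XEN_IRET;
    return sysret_ok ? EXIT_SYSRET : EXIT_IRET;
}
```

In this model the Xen case never reaches the sysret decision, which is the
property that would let the USERGS_SYSRET hooks go away.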

--Andy