linux-kernel - Re: [PATCH V3 3/7] x86/entry: Switch the stack after error

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJhGHyBKC4Xe=FTtv=+QRG1b5o=6G+vdMDP+oH83cz+29VXHBg@mail.gmail.com>
Date:   Thu, 17 Mar 2022 00:43:02 +0800
From:   Lai Jiangshan <jiangshanlai@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Lai Jiangshan <jiangshan.ljs@...group.com>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH V3 3/7] x86/entry: Switch the stack after error_entry() returns

On Wed, Mar 16, 2022 at 11:05 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> >  SYM_CODE_END(error_entry)
>
> So the new Changelog doesn't seem to help me much. But looking at both
> fixup_bad_iret() and sync_regs(), they both have:
>
>   __this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1


For a long time in old days (before KPTI), tss.sp0 does be the kernel
thread stack, but now is not the kernel thread stack, rather it is the
trampoline stack (or entry stack) in the cpu entry area.

And bad IRET can happen when doing IRET to return to userspace and it
is also the trampoline stack. So fixup_bad_iret() is really just moving
the lower partial pt_regs up to concat with user IRET frame head
in the entry stack.

So fixup_bad_iret() will NOT have setup pt_regs on the thread stack.
And sync_regs() is needed after fixup_bad_iret()

The patch
https://lore.kernel.org/lkml/20200817062355.2884-3-jiangshanlai@gmail.com/
tried changing fixup_bad_iret() to copy the pt_regs directly to
the kernel stack.

And in the V1 of the patchset that converting ASM to code also
tried it:
https://lore.kernel.org/lkml/20210831175025.27570-13-jiangshanlai@gmail.com/

And in the current patchset, it focuses on ASM code only. I don't think
we need to change it.  It would be much simpler to change the behavior
of fixup_bad_iret() when error_entry() is converted to C.

>
> as hard-coded destination. Now, fixup_bad_iret() sets up a complete
> ptregs there and then returns a pointer to this stack.
>
> sync_regs otoh, does a straight up pt_regs sized copy from arg0 to this
> new stack.
>
> Therefore it appears to me that doing sync_regs() after fixup_bad_iret()
> is a complete NO-OP and only confuses things further.
>
> Would not something like the below clarify things?
>
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1004,6 +1004,7 @@ SYM_CODE_START_LOCAL(error_entry)
>  .Lerror_entry_from_usermode_after_swapgs:
>         /* Put us onto the real thread stack. */
>         call    sync_regs
> +.Lerror_entry_from_usermode_after_sync_regs:
>         RET
>
>         /*
> @@ -1058,8 +1059,12 @@ SYM_CODE_START_LOCAL(error_entry)
>          */
>         leaq    8(%rsp), %rdi                   /* arg0 = pt_regs pointer */
>         call    fixup_bad_iret
> -       mov     %rax, %rdi
> -       jmp     .Lerror_entry_from_usermode_after_swapgs
> +       /*
> +        * fixup_bad_iret() will have setup pt_regs on the thread stack, and
> +        * returns a pointer to that stack exactly like sync_regs() would've
> +        * done. As such, calling sync_regs again makes no sense.
> +        */
> +       jmp     .Lerror_entry_from_usermode_after_sync_regs
>  SYM_CODE_END(error_entry)
>
>  SYM_CODE_START_LOCAL(error_return)