linux-kernel - Re: [PATCH V3 3/7] x86/entry: Switch the stack after error

Message-ID: <YjH8xJ3MKosyUl7M@hirez.programming.kicks-ass.net>
Date:   Wed, 16 Mar 2022 16:05:40 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Lai Jiangshan <jiangshanlai@...il.com>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Borislav Petkov <bp@...en8.de>,
        Lai Jiangshan <jiangshan.ljs@...group.com>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH V3 3/7] x86/entry: Switch the stack after error_entry()
 returns

On Tue, Mar 15, 2022 at 03:39:45PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@...group.com>
> 
> error_entry() calls sync_regs() to settle/copy the pt_regs and switches
> the stack directly after sync_regs().  But error_entry() itself is also
> a function call, so the switching has to handle its return address as
> well, which makes the work complicated and tangled.
> 
> Switching the stack after error_entry() returns makes the code simpler
> and more intuitive.
> 
> The behavior/logic is unchanged:
>   1) (opt) feed fixup_bad_iret() with the pt_regs pushed by ASM code
>   2) (opt) fixup_bad_iret() moves the partial pt_regs up
>   3) feed sync_regs() with the pt_regs pushed by ASM code or returned
>      by fixup_bad_iret()
>   4) sync_regs() copies the whole pt_regs to kernel stack if needed
>   5) after error_entry() and switching %rsp, it is in kernel stack with
>      the pt_regs
> 
> Changes only in calling:
>   Old code switches to copied pt_regs immediately twice in
>   error_entry() while new code switches to the copied pt_regs only
>   once after error_entry() returns.
>   It is correct since sync_regs() doesn't need to be called close
>   to the pt_regs it handles.
> 
>   Old code stashes the return-address of error_entry() in a scratch
>   register and new code doesn't stash it.
>   It relies on the fact that fixup_bad_iret() and sync_regs() don't
>   corrupt the return-address of error_entry() on the stack.  But the
>   old code also relies on the fact that fixup_bad_iret() and
>   sync_regs() don't corrupt their own return-addresses.
>   Both rely on the same guarantee, which holds.
> 
> After this change, error_entry() will not do fancy things with the stack
> except when in the prolog which will be fixed in the next patch ("move
> PUSH_AND_CLEAR_REGS out of error_entry").  This patch and the next patch
> can't be swapped because the next patch relies on this patch's stopping
> fiddling with the return-address of error_entry(), otherwise the objtool
> would complain.
> 
> Signed-off-by: Lai Jiangshan <jiangshan.ljs@...group.com>
> ---
>  arch/x86/entry/entry_64.S | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 37505331b7f1..7768cdd0c7ed 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -326,6 +326,8 @@ SYM_CODE_END(ret_from_fork)
>  .macro idtentry_body cfunc has_error_code:req
>  
>  	call	error_entry
> +	movq	%rax, %rsp			/* switch stack settled by sync_regs() */
> +	ENCODE_FRAME_POINTER
>  	UNWIND_HINT_REGS
>  
>  	movq	%rsp, %rdi			/* pt_regs pointer into 1st argument*/
> @@ -1014,14 +1016,10 @@ SYM_CODE_START_LOCAL(error_entry)
>  	/* We have user CR3.  Change to kernel CR3. */
>  	SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
>  
> +	leaq	8(%rsp), %rdi			/* arg0 = pt_regs pointer */
>  .Lerror_entry_from_usermode_after_swapgs:
>  	/* Put us onto the real thread stack. */
> -	popq	%r12				/* save return addr in %12 */
> -	movq	%rsp, %rdi			/* arg0 = pt_regs pointer */
>  	call	sync_regs
> -	movq	%rax, %rsp			/* switch stack */
> -	ENCODE_FRAME_POINTER
> -	pushq	%r12
>  	RET
>  
>  	/*
> @@ -1053,6 +1051,7 @@ SYM_CODE_START_LOCAL(error_entry)
>  	 */
>  .Lerror_entry_done_lfence:
>  	FENCE_SWAPGS_KERNEL_ENTRY
> +	leaq	8(%rsp), %rax			/* return pt_regs pointer */
>  	RET
>  
>  .Lbstep_iret:
> @@ -1073,12 +1072,9 @@ SYM_CODE_START_LOCAL(error_entry)
>  	 * Pretend that the exception came from user mode: set up pt_regs
>  	 * as if we faulted immediately after IRET.
>  	 */
> -	popq	%r12				/* save return addr in %12 */
> -	movq	%rsp, %rdi			/* arg0 = pt_regs pointer */
> +	leaq	8(%rsp), %rdi			/* arg0 = pt_regs pointer */
>  	call	fixup_bad_iret
> -	mov	%rax, %rsp
> -	ENCODE_FRAME_POINTER
> -	pushq	%r12
> +	mov	%rax, %rdi
>  	jmp	.Lerror_entry_from_usermode_after_swapgs
>  SYM_CODE_END(error_entry)

So the new Changelog doesn't seem to help me much. But looking at both
fixup_bad_iret() and sync_regs(), they both have:

  __this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1

as hard-coded destination. Now, fixup_bad_iret() sets up a complete
pt_regs there and then returns a pointer to this stack.

sync_regs(), on the other hand, does a straight pt_regs-sized copy from
arg0 to this new stack.

Therefore it appears to me that doing sync_regs() after fixup_bad_iret()
is a complete NO-OP and only confuses things further.
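
To make that argument concrete, here is a minimal userspace sketch in C. It
is a toy model and not the kernel code: struct toy_regs, thread_stack, sp0,
toy_fixup_bad_iret(), toy_sync_regs() and sync_after_fixup_is_noop() are all
hypothetical names, and the model assumes only what is described above (both
helpers target "sp0 - 1", one fills in a complete frame there, the other
copies a frame there).

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for struct pt_regs; the real layout does not matter here. */
struct toy_regs {
	unsigned long ip, cs, flags, sp, ss;
};

/* Toy "thread stack"; sp0 points one past its end, like a stack top. */
static struct toy_regs thread_stack[4];
static struct toy_regs *const sp0 = thread_stack + 4;

/* Models fixup_bad_iret(): build a complete frame at sp0 - 1, return it. */
static struct toy_regs *toy_fixup_bad_iret(const struct toy_regs *partial)
{
	struct toy_regs *regs = sp0 - 1;

	*regs = *partial;
	return regs;
}

/* Models sync_regs(): copy the frame at arg0 to sp0 - 1, return it. */
static struct toy_regs *toy_sync_regs(const struct toy_regs *eregs)
{
	struct toy_regs *regs = sp0 - 1;

	/* memmove() tolerates eregs == regs, which is the case of interest. */
	memmove(regs, eregs, sizeof(*regs));
	return regs;
}

/* Returns 1 if syncing after the fixup changes neither pointer nor data. */
int sync_after_fixup_is_noop(void)
{
	struct toy_regs entry = {
		.ip = 0x1000, .cs = 0x33, .flags = 0x202, .sp = 0x7000, .ss = 0x2b,
	};
	struct toy_regs *fixed = toy_fixup_bad_iret(&entry);
	struct toy_regs before = *fixed;
	struct toy_regs *synced = toy_sync_regs(fixed);

	assert(fixed == sp0 - 1);
	return synced == fixed && memcmp(&before, synced, sizeof(before)) == 0;
}
```

In this model the second call receives a pointer that already equals its own
destination, so it copies the frame onto itself and returns the same pointer.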

Would not something like the below clarify things?

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1004,6 +1004,7 @@ SYM_CODE_START_LOCAL(error_entry)
 .Lerror_entry_from_usermode_after_swapgs:
 	/* Put us onto the real thread stack. */
 	call	sync_regs
+.Lerror_entry_from_usermode_after_sync_regs:
 	RET
 
 	/*
@@ -1058,8 +1059,12 @@ SYM_CODE_START_LOCAL(error_entry)
 	 */
 	leaq	8(%rsp), %rdi			/* arg0 = pt_regs pointer */
 	call	fixup_bad_iret
-	mov	%rax, %rdi
-	jmp	.Lerror_entry_from_usermode_after_swapgs
+	/*
+	 * fixup_bad_iret() will have set up pt_regs on the thread stack, and
+	 * returns a pointer to that stack exactly like sync_regs() would've
+	 * done. As such, calling sync_regs again makes no sense.
+	 */
+	jmp	.Lerror_entry_from_usermode_after_sync_regs
 SYM_CODE_END(error_entry)
 
 SYM_CODE_START_LOCAL(error_return)