Message-ID: <5b5e597d-7620-4a5a-9bfa-bae26f0b0fa3@intel.com>
Date: Thu, 9 May 2024 09:14:01 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Robert Gill <rtgill82@...il.com>,
"Linux regression tracking (Thorsten Leemhuis)" <regressions@...mhuis.info>,
antonio.gomez.iglesias@...ux.intel.com, daniel.sneddon@...ux.intel.com
Subject: Re: [PATCH] x86/entry_32: Move CLEAR_CPU_BUFFERS before CR3 switch
On 4/26/24 16:48, Pawan Gupta wrote:
> As the mitigation for MDS and RFDS, CLEAR_CPU_BUFFERS macro executes VERW
> instruction that is used to clear the CPU buffers before returning to user
> space. Currently, VERW is executed after the user CR3 is restored. This
> leads to vm86() to fault because VERW takes a memory operand that is not
> mapped in user page tables when vm86() syscall returns. This is an issue
> with 32-bit kernels only, as 64-bit kernels do not support vm86().
entry.S has this handy comment:
    /*
     * Define the VERW operand that is disguised as entry code so that
     * it can be referenced with KPTI enabled. This ensure VERW can be
     * used late in exit-to-user path after page tables are switched.
     */
Why isn't that working?
> Move the VERW before the CR3 switch for 32-bit kernels as a workaround.
> This is slightly less secure because there is a possibility that the data
> in the registers may be sensitive, and doesn't get cleared from CPU
> buffers. As 32-bit kernels haven't received some of the other transient
> execution mitigations, this is a reasonable trade-off to ensure that
> vm86() syscall works.
>
> Fixes: a0e2dab44d22 ("x86/entry_32: Add VERW just before userspace transition")
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218707
> Closes: https://lore.kernel.org/all/8c77ccfd-d561-45a1-8ed5-6b75212c7a58@leemhuis.info/
> Reported-by: Robert Gill <rtgill82@...il.com>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
> ---
> arch/x86/entry/entry_32.S | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index d3a814efbff6..1b9c1587f06e 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -837,6 +837,7 @@ SYM_FUNC_START(entry_SYSENTER_32)
> jz .Lsyscall_32_done
>
> STACKLEAK_ERASE
> + CLEAR_CPU_BUFFERS
>
> /* Opportunistic SYSEXIT */
>
> @@ -881,7 +882,6 @@ SYM_FUNC_START(entry_SYSENTER_32)
> BUG_IF_WRONG_CR3 no_user_check=1
> popfl
> popl %eax
> - CLEAR_CPU_BUFFERS
Right now, this code basically does:
    STACKLEAK_ERASE
    /* Restore user registers and segments */
    movl PT_EIP(%esp), %edx
    ...
    SWITCH_TO_USER_CR3 scratch_reg=%eax
    ...
    CLEAR_CPU_BUFFERS
The proposed patch is:
    STACKLEAK_ERASE
+   CLEAR_CPU_BUFFERS
    /* Restore user registers and segments */
    movl PT_EIP(%esp), %edx
    ...
    SWITCH_TO_USER_CR3 scratch_reg=%eax
    ...
-   CLEAR_CPU_BUFFERS
That's a bit confusing to me. I would have expected the
CLEAR_CPU_BUFFERS to go _just_ before the SWITCH_TO_USER_CR3 and after
the user register restore.
Is there a reason it can't go there? I think only %eax is "live" with
kernel state at that point and it's only an entry stack pointer, so not
a secret.
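In other words, something like this (an untested sketch, not a patch; the
"..." lines stand for the same elided code as in the excerpts above):

    STACKLEAK_ERASE
    /* Restore user registers and segments */
    movl PT_EIP(%esp), %edx
    ...
    CLEAR_CPU_BUFFERS
    SWITCH_TO_USER_CR3 scratch_reg=%eax
    ...

With that ordering VERW would still run on the kernel page tables, so its
memory operand would stay mapped, and the user register restore would
already be done by the time the buffers are cleared.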
> /*
> * Return back to the vDSO, which will pop ecx and edx.
> @@ -941,6 +941,7 @@ SYM_FUNC_START(entry_INT80_32)
> STACKLEAK_ERASE
>
> restore_all_switch_stack:
> + CLEAR_CPU_BUFFERS
> SWITCH_TO_ENTRY_STACK
> CHECK_AND_APPLY_ESPFIX
>
> @@ -951,7 +952,6 @@ restore_all_switch_stack:
>
> /* Restore user state */
> RESTORE_REGS pop=4 # skip orig_eax/error_code
> - CLEAR_CPU_BUFFERS
> .Lirq_return:
> /*
> * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on IRET core serialization
There is a working stack here, on both sides of the CR3 switch. It's
annoying to do another push/pop which won't get patched out, but this
_could_ just do:
    RESTORE_REGS pop=4
    CLEAR_CPU_BUFFERS
    pushl %eax
    SWITCH_TO_USER_CR3 scratch_reg=%eax
    popl %eax
right?
That would only expose the CR3 value, which isn't a secret.