Message-ID: <28a04992-a187-7ff5-9e5d-aa21165ac5cd@oracle.com>
Date: Mon, 9 Nov 2020 15:00:44 +0100
From: Alexandre Chartre <alexandre.chartre@...cle.com>
To: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
x86@...nel.org, dave.hansen@...ux.intel.com, luto@...nel.org,
peterz@...radead.org, linux-kernel@...r.kernel.org,
thomas.lendacky@....com, jroedel@...e.de
Subject: Re: [RFC][PATCH 00/24] x86/pti: Defer CR3 switch to C code
Sorry, but it looks like the email addresses are messed up in my emails. Our email
server has a new security "feature" which has had the good idea of rewriting
external email addresses.
I will resend the patches with the correct addresses once I've found out
how to prevent this mess.
alex.
On 11/9/20 12:22 PM, Alexandre Chartre wrote:
> With Page Table Isolation (PTI), syscalls as well as interrupts and
> exceptions occurring in userspace enter the kernel with a user
> page-table. The kernel entry code will then switch the page-table
> from the user page-table to the kernel page-table by updating the
> CR3 control register. This CR3 switch is currently done early in
> the kernel entry sequence using assembly code.
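
For readers unfamiliar with the CR3 switch described above, here is a minimal
user-space sketch of the bit manipulation involved. It is not code from the
patchset. The names are hypothetical, and the bit positions are assumptions
based on my understanding of the mainline x86-64 PTI layout (the user PGD is
the second page of an 8 KiB pair, so bit 12 selects it, and bit 11 selects
the user PCID).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout (illustrative): bit 12 selects the user page-table,
 * bit 11 selects the user PCID. Names are hypothetical. */
#define SKETCH_USER_PGTABLE_MASK (1ULL << 12)
#define SKETCH_USER_PCID_MASK    (1ULL << 11)
#define SKETCH_USER_MASK \
	(SKETCH_USER_PGTABLE_MASK | SKETCH_USER_PCID_MASK)

/* Derive the kernel CR3 value from a user CR3 value by clearing both bits. */
static inline uint64_t sketch_cr3_to_kernel(uint64_t cr3)
{
	return cr3 & ~SKETCH_USER_MASK;
}

/* Derive the user CR3 value from a kernel CR3 value by setting both bits. */
static inline uint64_t sketch_cr3_to_user(uint64_t cr3)
{
	return cr3 | SKETCH_USER_MASK;
}

/* True if the CR3 value points at the user page-table. */
static inline bool sketch_cr3_is_user(uint64_t cr3)
{
	return (cr3 & SKETCH_USER_PGTABLE_MASK) != 0;
}

/* Small self-check showing the round trip on a made-up CR3 value. */
static void sketch_cr3_self_check(void)
{
	uint64_t kernel_cr3 = 0x12344000ULL;	/* arbitrary example value */
	uint64_t user_cr3 = sketch_cr3_to_user(kernel_cr3);

	assert(sketch_cr3_is_user(user_cr3));
	assert(!sketch_cr3_is_user(kernel_cr3));
	assert(sketch_cr3_to_kernel(user_cr3) == kernel_cr3);
}
```

In the real entry code the resulting value would be written to CR3; the sketch
only models the value computation, which is the part that moving the switch
to C would make easier to read.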
>
> This RFC proposes to defer the PTI CR3 switch until we reach C code.
> The benefit is that this simplifies the assembly entry code, and makes
> the PTI CR3 switch code easier to understand. This also paves the way
> for further possible projects such as an easier integration of Address
> Space Isolation (ASI), or the possibility to execute some selected
> syscall or interrupt handlers without switching to the kernel page-table
> (and thus avoid the PTI page-table switch overhead).
>
> Deferring CR3 switch to C code means that we need to run more of the
> kernel entry code with the user page-table. To do so, we need to:
>
> - map more syscall, interrupt and exception entry code into the user
> page-table (map all noinstr code);
>
> - map additional data used in the entry code (such as stack canary);
>
> - run more entry code on the trampoline stack (which is mapped both
> in the kernel and in the user page-table) until we switch to the
> kernel page-table and then switch to the kernel stack;
>
> - have a per-task trampoline stack instead of a per-cpu trampoline
> stack, so the task can be scheduled out while it hasn't switched
> to the kernel stack.
>
> Note that, for now, the CR3 switch can only be pushed as far as interrupts
> remain disabled in the entry code. This is because the CR3 switch is done
> based on the privilege level from the CS register from the interrupt frame.
> I plan to fix this, but that adds some extra complication (we need to track
> whether the user page-table is in use or not).
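
To illustrate the privilege-level test mentioned above, here is a minimal
sketch, not taken from the patchset. The structure and function names are
hypothetical; the only fact relied on is that the low two bits of a saved CS
selector hold the privilege level, with 3 meaning userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RPL_MASK 0x3u	/* low two bits of a segment selector */
#define SKETCH_USER_RPL 0x3u	/* ring 3 */

/* Hypothetical, simplified view of a hardware interrupt frame. */
struct sketch_frame {
	uint64_t ip;
	uint64_t cs;
	uint64_t flags;
	uint64_t sp;
	uint64_t ss;
};

/* True if the frame was saved while running in userspace. */
static inline bool sketch_from_user(const struct sketch_frame *f)
{
	return (f->cs & SKETCH_RPL_MASK) == SKETCH_USER_RPL;
}

/*
 * Decide whether a CR3 switch is needed using CS alone. This is only
 * valid while interrupts stay disabled: a nested interrupt taken after
 * kernel entry, but before the CR3 switch, would have a ring-0 CS while
 * still running on the user page-table.
 */
static inline bool sketch_needs_cr3_switch(const struct sketch_frame *f)
{
	return sketch_from_user(f);
}

/* Small self-check with example selector values. */
static void sketch_frame_self_check(void)
{
	struct sketch_frame user = { .cs = 0x33 };	/* RPL 3 */
	struct sketch_frame kern = { .cs = 0x10 };	/* RPL 0 */

	assert(sketch_needs_cr3_switch(&user));
	assert(!sketch_needs_cr3_switch(&kern));
}
```

The nested-interrupt case in the comment is why extra state would have to be
tracked before the switch could be deferred past the point where interrupts
are enabled.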
>
> The proposed patchset is in RFC state to get early feedback about this
> proposal.
>
> The code survives running a kernel build and LTP. Note that changes are
> only for 64-bit at the moment; I haven't looked at 32-bit yet, but I will
> definitely check it.
>
> Code is based on v5.10-rc3.
>
> Thanks,
>
> alex.
>
> -----
>
> Alexandre Chartre (24):
> x86/syscall: Add wrapper for invoking syscall function
> x86/entry: Update asm_call_on_stack to support more function arguments
> x86/entry: Consolidate IST entry from userspace
> x86/sev-es: Define a setup stack function for the VC idtentry
> x86/entry: Implement ret_from_fork body with C code
> x86/pti: Provide C variants of PTI switch CR3 macros
> x86/entry: Fill ESPFIX stack using C code
> x86/entry: Add C version of SWAPGS and SWAPGS_UNSAFE_STACK
> x86/entry: Add C version of paranoid_entry/exit
> x86/pti: Introduce per-task PTI trampoline stack
> x86/pti: Function to clone page-table entries from a specified mm
> x86/pti: Function to map per-cpu page-table entry
> x86/pti: Extend PTI user mappings
> x86/pti: Use PTI stack instead of trampoline stack
> x86/pti: Execute syscall functions on the kernel stack
> x86/pti: Execute IDT handlers on the kernel stack
> x86/pti: Execute IDT handlers with error code on the kernel stack
> x86/pti: Execute system vector handlers on the kernel stack
> x86/pti: Execute page fault handler on the kernel stack
> x86/pti: Execute NMI handler on the kernel stack
> x86/entry: Disable stack-protector for IST entry C handlers
> x86/entry: Defer paranoid entry/exit to C code
> x86/entry: Remove paranoid_entry and paranoid_exit
> x86/pti: Defer CR3 switch to C code for non-IST and syscall entries
>
>  arch/x86/entry/common.c               | 259 ++++++++++++-
>  arch/x86/entry/entry_64.S             | 513 ++++++++------------------
>  arch/x86/entry/entry_64_compat.S      |  22 --
>  arch/x86/include/asm/entry-common.h   | 108 ++++++
>  arch/x86/include/asm/idtentry.h       | 153 +++++++-
>  arch/x86/include/asm/irq_stack.h      |  11 +
>  arch/x86/include/asm/page_64_types.h  |  36 +-
>  arch/x86/include/asm/paravirt.h       |  15 +
>  arch/x86/include/asm/paravirt_types.h |  17 +-
>  arch/x86/include/asm/processor.h      |   3 +
>  arch/x86/include/asm/pti.h            |  18 +
>  arch/x86/include/asm/switch_to.h      |   7 +-
>  arch/x86/include/asm/traps.h          |   2 +-
>  arch/x86/kernel/cpu/mce/core.c        |   7 +-
>  arch/x86/kernel/espfix_64.c           |  41 ++
>  arch/x86/kernel/nmi.c                 |  34 +-
>  arch/x86/kernel/sev-es.c              |  52 +++
>  arch/x86/kernel/traps.c               |  61 +--
>  arch/x86/mm/fault.c                   |  11 +-
>  arch/x86/mm/pti.c                     |  71 ++--
>  kernel/fork.c                         |  22 ++
>  21 files changed, 1002 insertions(+), 461 deletions(-)
>