Message-ID: <20171211142618.rrcg5javpoinbigg@gmail.com>
Date: Mon, 11 Dec 2017 15:26:18 +0100
From: Ingo Molnar <mingo@...nel.org>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...capital.net>,
Cyrill Gorcunov <gorcunov@...nvz.org>,
Borislav Petkov <bp@...e.de>, Andi Kleen <ak@...ux.intel.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv5 2/3] x86/boot/compressed/64: Introduce
place_trampoline()
* Kirill A. Shutemov <kirill.shutemov@...ux.intel.com> wrote:
> If a bootloader enables 64-bit mode with 4-level paging, we might need to
> switch over to 5-level paging. The switch requires disabling paging,
> which works fine if the kernel itself is loaded below 4G.
>
> But if the bootloader put the kernel above 4G (not sure if anybody does
> this), we would lose control as soon as paging is disabled, because the
> code becomes unreachable to the CPU.
>
> To handle the situation, we need a trampoline in lower memory that would
> take care of switching on 5-level paging.
>
> Apart from the trampoline code itself we also need a place to store the
> top-level page table in lower memory as we don't have a way to load 64-bit
> values into CR3 in 32-bit mode. We only really need 8 bytes there as we
> only use the very first entry of the page table. But we allocate a whole
> page anyway.
>
> We cannot have the code in the same page as the page table because there's
> a risk that a CPU would read the page table speculatively and get confused
> by seeing garbage. It's never a good idea to have junk in PTE entries
> visible to the CPU.
>
> We also need a small stack in the trampoline to re-enable long mode via
> long return. But stack and code can share the page just fine.
>
> This patch introduces paging_prepare() that checks if we need to enable
> 5-level paging and then finds a right spot in lower memory for the
> trampoline. Then it copies the trampoline code there and sets up the new
> top level page table for 5-level paging.
>
> At this point we do all the preparation, but don't use the trampoline yet.
> It will be done in the following patch.
>
> The trampoline will be used even on 4-level paging machines. This way we
> will get better test coverage and keep the trampoline code in shape.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> ---
> arch/x86/boot/compressed/head_64.S | 44 ++++++++++++-------------
> arch/x86/boot/compressed/pgtable.h | 18 +++++++++++
> arch/x86/boot/compressed/pgtable_64.c | 61 ++++++++++++++++++++++++++++-------
> 3 files changed, 89 insertions(+), 34 deletions(-)
> create mode 100644 arch/x86/boot/compressed/pgtable.h
>
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index fc313e29fe2c..392324004d99 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -304,20 +304,6 @@ ENTRY(startup_64)
> /* Set up the stack */
> leaq boot_stack_end(%rbx), %rsp
>
> -#ifdef CONFIG_X86_5LEVEL
> - /*
> - * Check if we need to enable 5-level paging.
> - * RSI holds real mode data and need to be preserved across
> - * a function call.
> - */
> - pushq %rsi
> - call l5_paging_required
> - popq %rsi
> -
> - /* If l5_paging_required() returned zero, we're done here. */
> - cmpq $0, %rax
> - je lvl5
> -
> /*
> * At this point we are in long mode with 4-level paging enabled,
> * but we want to enable 5-level paging.
> @@ -325,12 +311,28 @@ ENTRY(startup_64)
> * The problem is that we cannot do it directly. Setting LA57 in
> * long mode would trigger #GP. So we need to switch off long mode
> * first.
> + */
> +
> + /*
> + * paging_prepare() would set up the trampoline and check if we need to
> + * enable 5-level paging.
> *
> - * NOTE: This is not going to work if bootloader put us above 4G
> - * limit.
> + * Address of the trampoline is returned in RAX. Bit 0 is used to
> + * encode if we need to enable 5-level paging.
Hm, that encoding looks unnecessarily complicated - why not return a 128-bit
struct, where the first 64 bits go into RAX and the second into RDX?
That way RAX can be
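
To illustrate the two-register return: under the System V x86-64 calling
convention a 16-byte struct made of two integer fields is returned in RAX:RDX,
so the caller gets both values with no bit masking. A rough sketch only; the
struct name, field names and both function names below are hypothetical and
not taken from the patch, and the bit-0 variant is included just for contrast:

```c
#include <stdint.h>

/*
 * Hypothetical return type: two 64-bit integer fields, 128 bits total.
 * With the System V x86-64 ABI the first field comes back in RAX and
 * the second in RDX.
 */
struct paging_config {
	uint64_t trampoline_start;	/* returned in RAX */
	uint64_t l5_required;		/* returned in RDX */
};

_Static_assert(sizeof(struct paging_config) == 16,
	       "struct must be exactly 128 bits");

/* Hypothetical stand-in for a paging_prepare() that returns the struct. */
static struct paging_config paging_prepare_sketch(uint64_t trampoline,
						  int need_l5)
{
	struct paging_config cfg = {
		.trampoline_start = trampoline,
		.l5_required = need_l5 ? 1 : 0,
	};

	return cfg;
}

/*
 * For contrast: address and flag packed into a single value, with the
 * flag in bit 0. The caller has to mask bit 0 off before using the
 * address.
 */
static uint64_t paging_prepare_bit0_sketch(uint64_t trampoline, int need_l5)
{
	return trampoline | (need_l5 ? 1ULL : 0ULL);
}
```

With the struct variant the assembly caller could test RDX directly and use
RAX as the address unmodified.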
Also, the patch looks a bit complex - could we split it into three more parts:

 - First part introduces the calling of paging_prepare(), and does the LA57 return
   code handling. The trampoline is not allocated and 0 is returned as the
   trampoline address (it's not used)

 - Second part allocates, initializes and returns the trampoline - but does not
   use it yet

 - Third patch uses the trampoline

This way if there's any breakage there's a very specific, dedicated patch to
bisect to.

Thanks,
Ingo