lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 14 Oct 2017 09:33:53 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>, x86@...nel.org,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...capital.net>,
        Cyrill Gorcunov <gorcunov@...nvz.org>,
        Borislav Petkov <bp@...e.de>, Andi Kleen <ak@...ux.intel.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2, RFC] x86/boot/compressed/64: Handle 5-level paging
 boot if kernel is above 4G


* Kirill A. Shutemov <kirill.shutemov@...ux.intel.com> wrote:

> This patch addresses shortcoming in current boot process on machines
> that supports 5-level paging.
> 
> If bootloader enables 64-bit mode with 4-level paging, we need to
> switch over to 5-level paging. The switching requires disabling paging.
> It works fine if kernel itself is loaded below 4G.
> 
> If bootloader put the kernel above 4G (not sure if anybody does this),
> we would loose control as soon as paging is disabled as code becomes
> unreachable.
> 
> This patch implements trampoline in lower memory to handle this
> situation.
> 
> Apart from trampoline itself we also need place to store top level page
> table in lower memory as we don't have a way to load 64-bit value into
> CR3 from 32-bit mode. We only really need 8-bytes there as we only use
> the very first entry of the page table.
> 
> place_trampoline() would choose an address for the trampoline page.
> The implementation is based on reserve_bios_regions(). We take a page
> next to end of lowmem.
> 
> We only need the page  for very short time, until main kernel image
> setup its own page tables.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> ---
>  arch/x86/boot/compressed/head_64.S | 87 ++++++++++++++++++++++++++------------
>  arch/x86/boot/compressed/misc.c    | 25 +++++++++++
>  2 files changed, 84 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index cefe4958fda9..961c72755986 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -288,8 +288,23 @@ ENTRY(startup_64)
>  	leaq	boot_stack_end(%rbx), %rsp
>  
>  #ifdef CONFIG_X86_5LEVEL
> +/*
> + * We need trampoline in lower memory switch from 4- to 5-level paging for
> + * cases when bootloader put kernel above 4G, but didn't enable 5-level paging
> + * for us.
> + *
> + * We also have to have top page table in lower memory as we don't have a way
> + * to load 64-bit value into CR3 from 32-bit mode. We only need 8-bytes there
> + * as we only use the very first entry of the page table.
> + *
> + * The same page can be used to place both trampoline code and top level page
> + * table. place_trampoline() will find suitable place for the trampoline page.
> + * Code will be placed with offset 0x100 from beginning of the page.
> + */
> +#define LVL5_TRAMPOLINE_CODE	0x100
> +
>  	/* Preserve RBX across CPUID */
> -	movq	%rbx, %r8
> +	movq	%rbx, %r15
>  
>  	/* Check if leaf 7 is supported */
>  	xorl	%eax, %eax
> @@ -307,9 +322,6 @@ ENTRY(startup_64)
>  	andl	$(1 << 16), %ecx
>  	jz	lvl5
>  
> -	/* Restore RBX */
> -	movq	%r8, %rbx
> -
>  	/* Check if 5-level paging has already been enabled */
>  	movq	%cr4, %rax
>  	testl	$X86_CR4_LA57, %eax
> @@ -323,34 +335,53 @@ ENTRY(startup_64)
>  	 * long mode would trigger #GP. So we need to switch off long mode
>  	 * first.
>  	 *
> -	 * NOTE: This is not going to work if bootloader put us above 4G
> -	 * limit.
> +	 * We use trampoline in lower memory to handle situation when
> +	 * bootloader put the kernel image above 4G.
>  	 *
>  	 * The first step is go into compatibility mode.
>  	 */
>  
> -	/* Clear additional page table */
> -	leaq	lvl5_pgtable(%rbx), %rdi
> -	xorq	%rax, %rax
> -	movq	$(PAGE_SIZE/8), %rcx
> -	rep	stosq
> +	/*
> +	 * Find sitable place for trampoline.
> +	 * The address will be stored in RBX.
> +	 */
> +	call	place_trampoline
> +	movq	%rax, %rbx
> +
> +	/* Preserve RSI, to be used by movsb below */
> +	movq	%rsi, %r14
> +
> +	/* Copy trampoline code in place */
> +	leaq	lvl5_trampoline_src(%rip), %rsi
> +	leaq	LVL5_TRAMPOLINE_CODE(%rbx), %rdi
> +	movq	$(lvl5_trampoline_end - lvl5_trampoline_src), %rcx
> +	rep	movsb
> +
> +	/* Restore RSI */
> +	movq	%r14, %rsi

Yeah, so first most of this code should be moved from assembly to C. Any reason 
why that cannot be done?

Cleanups like that are a precondition to adding this patch or other 5-level 
paging complications like the dynamic boot time switching.

Thanks,

	Ingo

Powered by blists - more mailing lists