Date: Tue, 20 Feb 2024 12:30:01 +0200
From: Mike Rapoport <rppt@...nel.org>
To: Alexander Graf <graf@...zon.com>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
	linux-mm@...ck.org, devicetree@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org, kexec@...ts.infradead.org,
	linux-doc@...r.kernel.org, x86@...nel.org,
	Eric Biederman <ebiederm@...ssion.com>,
	"H . Peter Anvin" <hpa@...or.com>,
	Andy Lutomirski <luto@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mark Rutland <mark.rutland@....com>,
	Tom Lendacky <thomas.lendacky@....com>,
	Ashish Kalra <ashish.kalra@....com>,
	James Gowans <jgowans@...zon.com>,
	Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>,
	arnd@...db.de, pbonzini@...hat.com, madvenka@...ux.microsoft.com,
	Anthony Yznaga <anthony.yznaga@...cle.com>,
	Usama Arif <usama.arif@...edance.com>,
	David Woodhouse <dwmw@...zon.co.uk>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Rob Herring <robh+dt@...nel.org>,
	Krzysztof Kozlowski <krzk@...nel.org>
Subject: Re: [PATCH v3 09/17] x86: Add KHO support

Hi Alex,

On Wed, Jan 17, 2024 at 02:46:56PM +0000, Alexander Graf wrote:
> We now have all bits in place to support KHO kexecs. This patch adds
> awareness of KHO in the kexec file as well as boot path for x86 and
> adds the respective kconfig option to the architecture so that it can
> use KHO successfully.
> 
> In addition, it enlightens the decompression code with KHO so that its
> KASLR location finder only considers memory regions that are not already
> occupied by KHO memory.
> 
> Signed-off-by: Alexander Graf <graf@...zon.com>
> 
> ---
> 
> v1 -> v2:
> 
>   - Change kconfig option to ARCH_SUPPORTS_KEXEC_KHO
>   - s/kho_reserve_mem/kho_reserve_previous_mem/g
>   - s/kho_reserve/kho_reserve_scratch/g
> ---
>  arch/x86/Kconfig                      |  3 ++
>  arch/x86/boot/compressed/kaslr.c      | 55 +++++++++++++++++++++++++++
>  arch/x86/include/uapi/asm/bootparam.h | 15 +++++++-
>  arch/x86/kernel/e820.c                |  9 +++++
>  arch/x86/kernel/kexec-bzimage64.c     | 39 +++++++++++++++++++
>  arch/x86/kernel/setup.c               | 46 ++++++++++++++++++++++
>  arch/x86/mm/init_32.c                 |  7 ++++
>  arch/x86/mm/init_64.c                 |  7 ++++
>  8 files changed, 180 insertions(+), 1 deletion(-)

..

> @@ -987,8 +1013,26 @@ void __init setup_arch(char **cmdline_p)
>  	cleanup_highmap();
>  
>  	memblock_set_current_limit(ISA_END_ADDRESS);
> +
>  	e820__memblock_setup();
>  
> +	/*
> +	 * We can resize memblocks at this point, let's dump all KHO
> +	 * reservations in and switch from scratch-only to normal allocations
> +	 */
> +	kho_reserve_previous_mem();
> +
> +	/* Allocations now skip scratch mem, return low 1M to the pool */
> +	if (is_kho_boot()) {
> +		u64 i;
> +		phys_addr_t base, end;
> +
> +		__for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
> +				     MEMBLOCK_SCRATCH, &base, &end, NULL)
> +			if (end <= ISA_END_ADDRESS)
> +				memblock_clear_scratch(base, end - base);
> +	}

You had to mark the lower 16M as MEMBLOCK_SCRATCH because at this point the
mapping of the physical memory is not ready yet and page tables only cover
the lower 16M and the memory mapped in kexec::init_pgtable(). Hence the call
to memblock_set_current_limit(ISA_END_ADDRESS) slightly above, which
essentially makes scratch mem reserved by KHO unusable for allocations.

I'd suggest moving kho_reserve_previous_mem() earlier, probably even right
next to kho_populate().
kho_populate() already does memblock_add(scratch), and at that point it's
the only physical memory that memblock knows of, so if memblock has to
allocate, the allocations will end up there.

Also, there are no kernel allocations before e820__memblock_setup(), so the
only memory that might need to be allocated is for memblock_double_array()
and that will be discarded later anyway.

With this, it seems that MEMBLOCK_SCRATCH is not needed, as the scratch
memory is the only usable memory up to e820__memblock_setup() anyway.
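
To illustrate the ordering argument, here is a toy user-space model. It is
not kernel code: the toy_* names are made up for this sketch, and the
allocator is a simple first-fit bump allocator that ignores alignment, unlike
the real memblock, which allocates top-down by default. It only shows that
while the scratch range is the sole region registered, any allocation has to
come from it, with no extra flag needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_REGIONS 8

/* One usable physical range; 'used' counts bytes handed out from its start. */
struct toy_region {
	uint64_t base;
	uint64_t size;
	uint64_t used;
};

/* Hypothetical stand-in for memblock.memory: a small array of ranges. */
struct toy_memblock {
	struct toy_region regions[TOY_MAX_REGIONS];
	size_t cnt;
};

/* Register a range as usable memory, loosely mirroring memblock_add(). */
static void toy_add(struct toy_memblock *mb, uint64_t base, uint64_t size)
{
	assert(mb->cnt < TOY_MAX_REGIONS);
	mb->regions[mb->cnt].base = base;
	mb->regions[mb->cnt].size = size;
	mb->regions[mb->cnt].used = 0;
	mb->cnt++;
}

/* First-fit allocation from the registered ranges; returns 0 on failure. */
static uint64_t toy_alloc(struct toy_memblock *mb, uint64_t size)
{
	for (size_t i = 0; i < mb->cnt; i++) {
		struct toy_region *r = &mb->regions[i];

		if (r->size - r->used >= size) {
			uint64_t addr = r->base + r->used;

			r->used += size;
			return addr;
		}
	}
	return 0;
}

/* Does addr fall inside [base, base + size)? */
static int toy_in_range(uint64_t addr, uint64_t base, uint64_t size)
{
	return addr >= base && addr - base < size;
}
```

Calling toy_add() once for a scratch range and then toy_alloc() before any
other range is added always yields an address inside that scratch range,
which is the situation between kho_populate() and e820__memblock_setup()
described above.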

>  	/*
>  	 * Needs to run after memblock setup because it needs the physical
>  	 * memory size.
> @@ -1104,6 +1148,8 @@ void __init setup_arch(char **cmdline_p)
>  	 */
>  	arch_reserve_crashkernel();
>  
> +	kho_reserve_scratch();
> +
>  	memblock_find_dma_reserve();
>  
>  	if (!early_xdbc_setup_hardware())
> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
> index b63403d7179d..6c3810afed04 100644
> --- a/arch/x86/mm/init_32.c
> +++ b/arch/x86/mm/init_32.c
> @@ -20,6 +20,7 @@
>  #include <linux/smp.h>
>  #include <linux/init.h>
>  #include <linux/highmem.h>
> +#include <linux/kexec.h>
>  #include <linux/pagemap.h>
>  #include <linux/pci.h>
>  #include <linux/pfn.h>
> @@ -738,6 +739,12 @@ void __init mem_init(void)
>  	after_bootmem = 1;
>  	x86_init.hyper.init_after_bootmem();
>  
> +	/*
> +	 * Now that all KHO pages are marked as reserved, let's flip them back
> +	 * to normal pages with accurate refcount.
> +	 */
> +	kho_populate_refcount();

This should go into mm_core_init(); there's nothing architecture-specific
about it.

> +
>  	/*
>  	 * Check boundaries twice: Some fundamental inconsistencies can
>  	 * be detected at build time already.

-- 
Sincerely yours,
Mike.
