Message-ID: <mafs05xjmqsqc.fsf@amazon.de>
Date: Wed, 2 Apr 2025 19:16:27 +0000
From: Pratyush Yadav <ptyadav@...zon.de>
To: Changyuan Lyu <changyuanl@...gle.com>
CC: <linux-kernel@...r.kernel.org>, <graf@...zon.com>,
	<akpm@...ux-foundation.org>, <luto@...nel.org>, <anthony.yznaga@...cle.com>,
	<arnd@...db.de>, <ashish.kalra@....com>, <benh@...nel.crashing.org>,
	<bp@...en8.de>, <catalin.marinas@....com>, <dave.hansen@...ux.intel.com>,
	<dwmw2@...radead.org>, <ebiederm@...ssion.com>, <mingo@...hat.com>,
	<jgowans@...zon.com>, <corbet@....net>, <krzk@...nel.org>, <rppt@...nel.org>,
	<mark.rutland@....com>, <pbonzini@...hat.com>, <pasha.tatashin@...een.com>,
	<hpa@...or.com>, <peterz@...radead.org>, <robh+dt@...nel.org>,
	<robh@...nel.org>, <saravanak@...gle.com>,
	<skinsburskii@...ux.microsoft.com>, <rostedt@...dmis.org>,
	<tglx@...utronix.de>, <thomas.lendacky@....com>, <usama.arif@...edance.com>,
	<will@...nel.org>, <devicetree@...r.kernel.org>, <kexec@...ts.infradead.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-doc@...r.kernel.org>,
	<linux-mm@...ck.org>, <x86@...nel.org>, Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [PATCH v5 09/16] kexec: enable KHO support for memory preservation

Hi Changyuan,

On Wed, Mar 19 2025, Changyuan Lyu wrote:

> From: "Mike Rapoport (Microsoft)" <rppt@...nel.org>
>
> Introduce APIs allowing KHO users to preserve memory across kexec and
> get access to that memory after boot of the kexeced kernel
>
> kho_preserve_folio() - record a folio to be preserved over kexec
> kho_restore_folio() - recreates the folio from the preserved memory
> kho_preserve_phys() - record physically contiguous range to be
> preserved over kexec.
> kho_restore_phys() - recreates order-0 pages corresponding to the
> preserved physical range
>
> The memory preservations are tracked by two levels of xarrays to manage
> chunks of per-order 512 byte bitmaps. For instance the entire 1G order
> of a 1TB x86 system would fit inside a single 512 byte bitmap. For
> order 0 allocations each bitmap will cover 16M of address space. Thus,
> for 16G of memory at most 512K of bitmap memory will be needed for order 0.
>
> At serialization time all bitmaps are recorded in a linked list of pages
> for the next kernel to process and the physical address of the list is
> recorded in KHO FDT.
>
> The next kernel then processes that list, reserves the memory ranges and
> later, when a user requests a folio or a physical range, KHO restores
> corresponding memory map entries.
>
> Suggested-by: Jason Gunthorpe <jgg@...dia.com>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
> Co-developed-by: Changyuan Lyu <changyuanl@...gle.com>
> Signed-off-by: Changyuan Lyu <changyuanl@...gle.com>
> ---
>  include/linux/kexec_handover.h |  38 +++
>  kernel/kexec_handover.c        | 486 ++++++++++++++++++++++++++++++++-
>  2 files changed, 522 insertions(+), 2 deletions(-)
[...]
> +int kho_preserve_phys(phys_addr_t phys, size_t size)
> +{
> +	unsigned long pfn = PHYS_PFN(phys), end_pfn = PHYS_PFN(phys + size);
> +	unsigned int order = ilog2(end_pfn - pfn);

This caught my eye when playing around with the code. It does not put
any limit on the order, so it can exceed NR_PAGE_ORDERS. Also, when
initializing the page after KHO, we pass the order directly to
prep_compound_page() without sanity checking it. The next kernel might
not support all the orders the current one supports. Perhaps something
to fix?
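
To make the concern concrete, here is a minimal userspace sketch of one way the preserve-side loop could cap the order. It is not taken from the patch. SKETCH_MAX_ORDER is a hypothetical stand-in for the kernel's MAX_PAGE_ORDER (assumed to be 10 here; the real value depends on the configuration), and the sketch_* helpers are illustrative names only.

```c
/* Hypothetical stand-in for MAX_PAGE_ORDER; assumed value, config dependent. */
#define SKETCH_MAX_ORDER 10u

/* Userspace stand-in for the kernel's ilog2(); n must be non-zero. */
static unsigned int sketch_ilog2(unsigned long n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* Largest order that fits in nr_pages, capped at SKETCH_MAX_ORDER. */
static unsigned int sketch_clamped_order(unsigned long nr_pages)
{
	unsigned int order = sketch_ilog2(nr_pages);

	return order > SKETCH_MAX_ORDER ? SKETCH_MAX_ORDER : order;
}

/*
 * Walk [pfn, end_pfn) the way the quoted loop does, but with the capped
 * order. Returns the number of chunks and stores the largest order used
 * in *max_seen.
 */
static unsigned long sketch_split(unsigned long pfn, unsigned long end_pfn,
				  unsigned int *max_seen)
{
	unsigned long chunks = 0;

	*max_seen = 0;
	while (pfn < end_pfn) {
		unsigned int order = sketch_clamped_order(end_pfn - pfn);

		if (order > *max_seen)
			*max_seen = order;
		pfn += 1UL << order;
		chunks++;
	}
	return chunks;
}
```

With these assumed values, a 1G range of 4K pages (2^18 pages) would be recorded as 256 order-10 chunks instead of a single order-18 entry.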

> +	unsigned long failed_pfn;
> +	int err = 0;
> +
> +	if (!kho_enable)
> +		return -EOPNOTSUPP;
> +
> +	down_read(&kho_out.tree_lock);
> +	if (kho_out.fdt) {
> +		err = -EBUSY;
> +		goto unlock;
> +	}
> +
> +	for (; pfn < end_pfn;
> +	     pfn += (1 << order), order = ilog2(end_pfn - pfn)) {
> +		err = __kho_preserve(&kho_mem_track, pfn, order);
> +		if (err) {
> +			failed_pfn = pfn;
> +			break;
> +		}
> +	}
[...]
> +struct folio *kho_restore_folio(phys_addr_t phys)
> +{
> +	struct page *page = pfn_to_online_page(PHYS_PFN(phys));
> +	unsigned long order;
> +
> +	if (!page)
> +		return NULL;
> +
> +	order = page->private;
> +	if (order)
> +		prep_compound_page(page, order);
> +	else
> +		kho_restore_page(page);
> +
> +	return page_folio(page);
> +}
[...]
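
On the restore side, a sanity check along these lines might be enough before the order reaches prep_compound_page(). This is a rough userspace sketch under stated assumptions, not the patch's code. SKETCH_NR_PAGE_ORDERS is a hypothetical stand-in for the running kernel's NR_PAGE_ORDERS (assumed to be 11 here), and sketch_restore_order_ok() is an illustrative name. The alignment check is an extra assumption of this sketch, not something the patch requires.

```c
#include <stdbool.h>

/* Hypothetical stand-in for NR_PAGE_ORDERS in the *new* kernel; assumed value. */
#define SKETCH_NR_PAGE_ORDERS 11u

/*
 * Return true if an order handed over by the previous kernel is usable:
 * it must be an order this kernel supports, and the starting pfn must be
 * naturally aligned to that order.
 */
static bool sketch_restore_order_ok(unsigned long pfn, unsigned long order)
{
	/* Reject unsupported orders first; this also keeps the shift below in range. */
	if (order >= SKETCH_NR_PAGE_ORDERS)
		return false;

	return (pfn & ((1UL << order) - 1)) == 0;
}
```

A caller could return NULL, or fall back to restoring order-0 pages, when the check fails. Which of those is appropriate is a policy decision this sketch does not try to settle.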

-- 
Regards,
Pratyush Yadav
