Message-ID: <Z-6TxZWEWbKSCqfh@kernel.org>
Date: Thu, 3 Apr 2025 16:57:25 +0300
From: Mike Rapoport <rppt@...nel.org>
To: Pratyush Yadav <ptyadav@...zon.de>
Cc: Changyuan Lyu <changyuanl@...gle.com>, linux-kernel@...r.kernel.org,
graf@...zon.com, akpm@...ux-foundation.org, luto@...nel.org,
anthony.yznaga@...cle.com, arnd@...db.de, ashish.kalra@....com,
benh@...nel.crashing.org, bp@...en8.de, catalin.marinas@....com,
dave.hansen@...ux.intel.com, dwmw2@...radead.org,
ebiederm@...ssion.com, mingo@...hat.com, jgowans@...zon.com,
corbet@....net, krzk@...nel.org, mark.rutland@....com,
pbonzini@...hat.com, pasha.tatashin@...een.com, hpa@...or.com,
peterz@...radead.org, robh+dt@...nel.org, robh@...nel.org,
saravanak@...gle.com, skinsburskii@...ux.microsoft.com,
rostedt@...dmis.org, tglx@...utronix.de, thomas.lendacky@....com,
usama.arif@...edance.com, will@...nel.org,
devicetree@...r.kernel.org, kexec@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, x86@...nel.org,
Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [PATCH v5 09/16] kexec: enable KHO support for memory
preservation
On Wed, Apr 02, 2025 at 07:16:27PM +0000, Pratyush Yadav wrote:
> Hi Changyuan,
>
> On Wed, Mar 19 2025, Changyuan Lyu wrote:
>
> > From: "Mike Rapoport (Microsoft)" <rppt@...nel.org>
> >
> > Introduce APIs allowing KHO users to preserve memory across kexec and
> > get access to that memory after boot of the kexeced kernel:
> >
> > kho_preserve_folio() - record a folio to be preserved over kexec
> > kho_restore_folio() - recreate the folio from the preserved memory
> > kho_preserve_phys() - record a physically contiguous range to be
> >                       preserved over kexec
> > kho_restore_phys() - recreate order-0 pages corresponding to the
> >                      preserved physical range
> >
> > The memory preservations are tracked by two levels of xarrays that
> > manage chunks of per-order 512-byte bitmaps. For instance, the entire
> > 1G order of a 1TB x86 system fits inside a single 512-byte bitmap. For
> > order-0 allocations each bitmap covers 16M of address space. Thus, for
> > 16G of memory at most 512K of bitmap memory is needed for order 0.
> >
> > At serialization time all bitmaps are recorded in a linked list of
> > pages for the next kernel to process, and the physical address of the
> > list is recorded in the KHO FDT.
> >
> > The next kernel then processes that list, reserves the memory ranges and
> > later, when a user requests a folio or a physical range, KHO restores
> > corresponding memory map entries.
> >
> > Suggested-by: Jason Gunthorpe <jgg@...dia.com>
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
> > Co-developed-by: Changyuan Lyu <changyuanl@...gle.com>
> > Signed-off-by: Changyuan Lyu <changyuanl@...gle.com>
> > ---
> > include/linux/kexec_handover.h | 38 +++
> > kernel/kexec_handover.c | 486 ++++++++++++++++++++++++++++++++-
> > 2 files changed, 522 insertions(+), 2 deletions(-)
> [...]
> > +int kho_preserve_phys(phys_addr_t phys, size_t size)
> > +{
> > +	unsigned long pfn = PHYS_PFN(phys), end_pfn = PHYS_PFN(phys + size);
> > +	unsigned int order = ilog2(end_pfn - pfn);
>
> This caught my eye when playing around with the code. It does not put
> any limit on the order, so it can exceed NR_PAGE_ORDERS.

I don't see a problem with this.

> Also, when initializing the page after KHO, we pass the order directly
> to prep_compound_page() without sanity checking it. The next kernel
> might not support all the orders the current one supports. Perhaps
> something to fix?

And this needs to be fixed and we should refuse to create folios larger
than MAX_ORDER.
> > +	unsigned long failed_pfn;
> > +	int err = 0;
> > +
> > +	if (!kho_enable)
> > +		return -EOPNOTSUPP;
> > +
> > +	down_read(&kho_out.tree_lock);
> > +	if (kho_out.fdt) {
> > +		err = -EBUSY;
> > +		goto unlock;
> > +	}
> > +
> > +	for (; pfn < end_pfn;
> > +	     pfn += (1 << order), order = ilog2(end_pfn - pfn)) {
> > +		err = __kho_preserve(&kho_mem_track, pfn, order);
> > +		if (err) {
> > +			failed_pfn = pfn;
> > +			break;
> > +		}
> > +	}
> [...]
> > +struct folio *kho_restore_folio(phys_addr_t phys)
> > +{
> > +	struct page *page = pfn_to_online_page(PHYS_PFN(phys));
> > +	unsigned long order;
> > +
> > +	if (!page)
> > +		return NULL;
> > +
> > +	order = page->private;
> > +	if (order)
> > +		prep_compound_page(page, order);
> > +	else
> > +		kho_restore_page(page);
> > +
> > +	return page_folio(page);
> > +}
> [...]
>
> --
> Regards,
> Pratyush Yadav
--
Sincerely yours,
Mike.