lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aWx5qHHYEXJXNuXm@kernel.org>
Date: Sun, 18 Jan 2026 08:11:52 +0200
From: Mike Rapoport <rppt@...nel.org>
To: Pratyush Yadav <pratyush@...nel.org>
Cc: Breno Leitao <leitao@...ian.org>,
	Pasha Tatashin <pasha.tatashin@...een.com>,
	akpm@...ux-foundation.org, graf@...zon.com,
	linux-kernel@...r.kernel.org, kexec@...ts.infradead.org,
	linux-mm@...ck.org, ricardo.neri-calderon@...ux.intel.com,
	kernel-team@...a.com
Subject: Re: [PATCH v4] kho: validate preserved memory map during population

On Fri, Jan 16, 2026 at 04:21:28PM +0000, Pratyush Yadav wrote:
> Hi Breno,
> 
> On Thu, Jan 08 2026, Breno Leitao wrote:
> 
> > Hello Pasha,
> >
> > On Tue, Dec 23, 2025 at 09:01:40AM -0500, Pasha Tatashin wrote:
> >> If the previous kernel enabled KHO but did not call kho_finalize()
> >> (e.g., CONFIG_LIVEUPDATE=n or userspace skipped the finalization step),
> >> the 'preserved-memory-map' property in the FDT remains empty/zero.
> >> 
> >> Previously, kho_populate() would succeed regardless of the memory map's
> >> state, reserving the incoming scratch regions in memblock. However,
> >> kho_memory_init() would later fail to deserialize the empty map. By that
> >> time, the scratch regions were already registered, leading to partial
> >> initialization and subsequent list corruption (freeing scratch area
> >> twice) during kho_init().
> >
> > While trying my new patchset [0] on top of this patch, I got the
> > following issue:
> >
> > 	[    0.000000] KHO: disabling KHO revival: -2
> >
> > Trying to solve it, I come up with a change in kho_get_mem_map_phys() to
> > distinguish no memory and error, see the patch attached later.
> >
> > This is what I used to test [0] on top of linux-next. Is this useful?
> >
> > Link: https://lore.kernel.org/all/20260108-kho-v3-1-b1d6b7a89342@debian.org/ [0]
> >
> > thanks
> > --breno
> >
> > commit 5d7855fede8110d74942e1b67056ba589a1cb54a
> > Author: Breno Leitao <leitao@...ian.org>
> > Date:   Thu Jan 8 07:44:08 2026 -0800
> >
> >     kho: allow KHO to work when no memory is preserved
> >     
> >     Fix KHO initialization failing when no memory pages were preserved by
> >     the previous kernel.
> >     
> >     Commit eda79a683a0a ("kho: validate preserved memory map during
> >     population") introduced kho_get_mem_map_phys() which returns the physical
> >     address of the preserved memory map directly as its return value. The
> >     caller then validates it with:
> >     
> >         mem_map_phys = kho_get_mem_map_phys(fdt);
> >         if (!mem_map_phys) {
> >             err = -ENOENT;
> >             goto out;
> >         }
> >     
> >     This creates an ambiguity: physical address 0 is used both as an error
> >     indicator (property missing/malformed) and as a valid value (property
> >     exists with value 0, meaning no memory was preserved).
> >     
> >     "No memory preserved" is a legitimate state. KHO provides features beyond
> >     memory page preservation, such as previous kernel version tracking and
> >     kexec count tracking. When the previous kernel enables KHO but doesn't
> >     preserve any memory pages, it sets 'preserved-memory-map' to 0. This is
> >     semantically different from "KHO not initialized" - it means "KHO is
> >     active, there's just nothing in the memory map."
> 
> This isn't true. If you hand over _any_ state, you will at least need
> the KHO FDT. And the KHO FDT is preserved memory (see the
> kho_alloc_preserve() call in kho_init()). So I don't see how you can
> ever have valid KHO with no memory.
> 
> mem_map_phys _can_ be 0, but only when KHO was enabled but not used. And
> that is of course also a valid use case.
> 
> We want to treat mem_map_phys == 0 the same as the error path, just
> without the error print. This lets us discard all previous scratch areas
> since they don't have anything useful anyway, and have a fresh start.
> 
> So while you are seeing this error message, I don't think it should
> break anything and KHO should still be working fine. You can
> double-check this by inspecting /sys/kernel/debug/kho/out.
> 
> So I think the patch is certainly a useful fix, it just needs some
> re-wording and fixups.
> 
> Some comments on the code below.
> 
> >     
> >     Before eda79a683a0a, the code handled this gracefully in
> >     kho_mem_deserialize():
> >     
> >         chunk = mem ? phys_to_virt(mem) : NULL;
> >         if (!chunk)
> >             return false;  // No pages, but KHO could still work
> >     
> >     After eda79a683a0a, the early validation conflated "no property" with
> >     "property value is 0", causing KHO to be completely disabled in both
> >     cases.
> >     
> >     Fix this by changing kho_get_mem_map_phys() to return an error code and
> >     pass the physical address via pointer. This allows distinguishing between:
> >      - Property missing/malformed: return -ENOENT (KHO fails)
> >      - Property exists with value 0: return 0 (KHO succeeds, no memory to
> >        restore)
> >     
> >     Fixes: eda79a683a0a ("kho: validate preserved memory map during population")
> >     Signed-off-by: Breno Leitao <leitao@...ian.org>
> >
> > diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> > index 271d90198a08..3cf2dc6840c9 100644
> > --- a/kernel/liveupdate/kexec_handover.c
> > +++ b/kernel/liveupdate/kexec_handover.c
> > @@ -471,8 +471,8 @@ static void __init deserialize_bitmap(unsigned int order,
> >  	}
> >  }
> >  
> > -/* Returns physical address of the preserved memory map from FDT */
> > -static phys_addr_t __init kho_get_mem_map_phys(const void *fdt)
> > +/* Returns 0 on success and stores physical address in *phys_out */
> > +static int __init kho_get_mem_map_phys(const void *fdt, phys_addr_t *phys_out)
> >  {
> >  	const void *mem_ptr;
> >  	int len;
> > @@ -480,10 +480,11 @@ static phys_addr_t __init kho_get_mem_map_phys(const void *fdt)
> >  	mem_ptr = fdt_getprop(fdt, 0, KHO_FDT_MEMORY_MAP_PROP_NAME, &len);
> >  	if (!mem_ptr || len != sizeof(u64)) {
> >  		pr_err("failed to get preserved memory bitmaps\n");
> > -		return 0;
> > +		return -ENOENT;
> >  	}
> >  
> > -	return get_unaligned((const u64 *)mem_ptr);
> > +	*phys_out = get_unaligned((const u64 *)mem_ptr);
> > +	return 0;
> >  }
> >  
> >  static void __init kho_mem_deserialize(struct khoser_mem_chunk *chunk)
> > @@ -1439,7 +1440,7 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
> >  			 phys_addr_t scratch_phys, u64 scratch_len)
> >  {
> >  	struct kho_scratch *scratch = NULL;
> > -	phys_addr_t mem_map_phys;
> > +	phys_addr_t mem_map_phys = 0;
> >  	void *fdt = NULL;
> >  	int err = 0;
> >  	unsigned int scratch_cnt = scratch_len / sizeof(*kho_scratch);
> > @@ -1466,11 +1467,9 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
> >  		goto out;
> >  	}
> >  
> > -	mem_map_phys = kho_get_mem_map_phys(fdt);
> > -	if (!mem_map_phys) {
> > -		err = -ENOENT;
> > +	err = kho_get_mem_map_phys(fdt, &mem_map_phys);
> > +	if (err)
> 
> This will break when mem_map_phys == 0. As I explained earlier, when
> that happens we want to discard all previous scratch info and start with
> a clean slate.
> 
> Making this if (err || !mem_map_phys) should do the trick. The if (err)
> check before the print should make sure the error message is not printed
> when we have a valid property but its value is 0.

While we are on it, I'd suggest to change kho_populate() error handling to
use goto, (i.e like below)
Then a simple if (err) will do and that's much clearer.

Another thing I noticed it that assigning err to -EFAULT or -EINVAL after
printks is completely redundant, since we anyway report what went wrong, so
printing the error value in the end just not needed.

diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index feffeafa51b7..2bba111149c4 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -1453,27 +1453,27 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
 	if (!fdt) {
 		pr_warn("setup: failed to memremap FDT (0x%llx)\n", fdt_phys);
 		err = -EFAULT;
-		goto out;
+		goto err_report;
 	}
 	err = fdt_check_header(fdt);
 	if (err) {
 		pr_warn("setup: handover FDT (0x%llx) is invalid: %d\n",
 			fdt_phys, err);
 		err = -EINVAL;
-		goto out;
+		goto err_unmap_fdt;
 	}
 	err = fdt_node_check_compatible(fdt, 0, KHO_FDT_COMPATIBLE);
 	if (err) {
 		pr_warn("setup: handover FDT (0x%llx) is incompatible with '%s': %d\n",
 			fdt_phys, KHO_FDT_COMPATIBLE, err);
 		err = -EINVAL;
-		goto out;
+		goto err_unmap_fdt;
 	}
 
 	mem_map_phys = kho_get_mem_map_phys(fdt);
 	if (!mem_map_phys) {
 		err = -ENOENT;
-		goto out;
+		goto err_unmap_fdt;
 	}
 
 	scratch = early_memremap(scratch_phys, scratch_len);
@@ -1481,7 +1481,7 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
 		pr_warn("setup: failed to memremap scratch (phys=0x%llx, len=%lld)\n",
 			scratch_phys, scratch_len);
 		err = -EFAULT;
-		goto out;
+		goto err_unmap_scratch;
 	}
 
 	/*
@@ -1498,7 +1498,7 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
 		if (WARN_ON(err)) {
 			pr_warn("failed to mark the scratch region 0x%pa+0x%pa: %pe",
 				&area->addr, &size, ERR_PTR(err));
-			goto out;
+			goto err_unmap_scratch;
 		}
 		pr_debug("Marked 0x%pa+0x%pa as scratch", &area->addr, &size);
 	}
@@ -1520,13 +1520,14 @@ void __init kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
 	kho_scratch_cnt = scratch_cnt;
 	pr_info("found kexec handover data.\n");
 
-out:
-	if (fdt)
-		early_memunmap(fdt, fdt_len);
-	if (scratch)
-		early_memunmap(scratch, scratch_len);
-	if (err)
-		pr_warn("disabling KHO revival: %d\n", err);
+	return;
+
+err_unmap_scratch:
+	early_memunmap(scratch, scratch_len);
+err_unmap_fdt:
+	early_memunmap(fdt, fdt_len);
+err_report:
+	pr_warn("disabling KHO revival: %d\n", err);
 }
 
 /* Helper functions for kexec_file_load */
 
> >  		goto out;
> > -	}
> >  
> >  	scratch = early_memremap(scratch_phys, scratch_len);
> >  	if (!scratch) {
> 
> -- 
> Regards,
> Pratyush Yadav

-- 
Sincerely yours,
Mike.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ