Message-ID: <z2lraqiivexvfvokff4gmqyvbpswmfwhbkhv4u7lo3nnnyz346@mibnt7dkrvey>
Date: Mon, 19 Jan 2026 10:54:09 -0800
From: Breno Leitao <leitao@...ian.org>
To: Pratyush Yadav <pratyush@...nel.org>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, akpm@...ux-foundation.org,
rppt@...nel.org, graf@...zon.com, linux-kernel@...r.kernel.org,
kexec@...ts.infradead.org, linux-mm@...ck.org, ricardo.neri-calderon@...ux.intel.com,
kernel-team@...a.com
Subject: Re: [PATCH v4] kho: validate preserved memory map during population
On Fri, Jan 16, 2026 at 04:21:28PM +0000, Pratyush Yadav wrote:
> > On Tue, Dec 23, 2025 at 09:01:40AM -0500, Pasha Tatashin wrote:
> >> If the previous kernel enabled KHO but did not call kho_finalize()
> >> (e.g., CONFIG_LIVEUPDATE=n or userspace skipped the finalization step),
> >> the 'preserved-memory-map' property in the FDT remains empty/zero.
> >>
> >> Previously, kho_populate() would succeed regardless of the memory map's
> >> state, reserving the incoming scratch regions in memblock. However,
> >> kho_memory_init() would later fail to deserialize the empty map. By that
> >> time, the scratch regions were already registered, leading to partial
> >> initialization and subsequent list corruption (freeing scratch area
> >> twice) during kho_init().
> >
> > While trying my new patchset [0] on top of this patch, I got the
> > following issue:
> >
> > [ 0.000000] KHO: disabling KHO revival: -2
> >
> > Trying to solve it, I came up with a change in kho_get_mem_map_phys() to
> > distinguish "no memory" from an error, see the patch attached later.
> >
> > This is what I used to test [0] on top of linux-next. Is this useful?
> >
> > Link: https://lore.kernel.org/all/20260108-kho-v3-1-b1d6b7a89342@debian.org/ [0]
> >
> > thanks
> > --breno
> >
> > commit 5d7855fede8110d74942e1b67056ba589a1cb54a
> > Author: Breno Leitao <leitao@...ian.org>
> > Date: Thu Jan 8 07:44:08 2026 -0800
> >
> > kho: allow KHO to work when no memory is preserved
> >
> > Fix KHO initialization failing when no memory pages were preserved by
> > the previous kernel.
> >
> > Commit eda79a683a0a ("kho: validate preserved memory map during
> > population") introduced kho_get_mem_map_phys() which returns the physical
> > address of the preserved memory map directly as its return value. The
> > caller then validates it with:
> >
> >     mem_map_phys = kho_get_mem_map_phys(fdt);
> >     if (!mem_map_phys) {
> >             err = -ENOENT;
> >             goto out;
> >     }
> >
> > This creates an ambiguity: physical address 0 is used both as an error
> > indicator (property missing/malformed) and as a valid value (property
> > exists with value 0, meaning no memory was preserved).
> >
> > "No memory preserved" is a legitimate state. KHO provides features beyond
> > memory page preservation, such as previous kernel version tracking and
> > kexec count tracking. When the previous kernel enables KHO but doesn't
> > preserve any memory pages, it sets 'preserved-memory-map' to 0. This is
> > semantically different from "KHO not initialized" - it means "KHO is
> > active, there's just nothing in the memory map."
>
> This isn't true. If you hand over _any_ state, you will at least need
> the KHO FDT. And the KHO FDT is preserved memory (see the
> kho_alloc_preserve() call in kho_init()). So I don't see how you can
> ever have valid KHO with no memory.
>
> mem_map_phys _can_ be 0, but only when KHO was enabled but not used. And
> that is of course also a valid use case.

Oh, I was not finalizing KHO, and commit e1c3bfd091f363c1
("kho: validate preserved memory map during population") made this case fail
on purpose.

So, I understand we want to fail if mem_map_phys is 0 even though the FDT was
properly passed (how is that possible)? I know I can read
KHO_PROP_PREVIOUS_RELEASE from the FDT, even when
mem_map_phys is 0.
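
For illustration, the ambiguity described in the quoted commit message (0 used
both as "property missing/malformed" and as "property present with value 0")
can be avoided by returning a status code and passing the address back through
an out-parameter. This is only a minimal userspace sketch under stated
assumptions: struct fake_prop and this get_mem_map_phys() are hypothetical
stand-ins, not the kernel's kho_get_mem_map_phys() or any libfdt API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a property looked up in the KHO FDT. */
struct fake_prop {
	const void *data;	/* NULL models a missing property */
	int len;		/* property length in bytes */
};

/*
 * Hypothetical helper: returns 0 and stores the value in *phys when the
 * property is present and well-formed (a stored value of 0 included),
 * -ENOENT when the property is missing, -EINVAL when its size is wrong.
 */
static int get_mem_map_phys(const struct fake_prop *prop, uint64_t *phys)
{
	if (!prop->data)
		return -ENOENT;
	if (prop->len != (int)sizeof(*phys))
		return -EINVAL;

	memcpy(phys, prop->data, sizeof(*phys));
	return 0;
}
```

With this shape the caller can tell "lookup failed" apart from "lookup
succeeded and the value is 0", and then decide separately whether a zero
memory map should disable KHO revival.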