Message-ID: <aVP9NHDn0WFtkMNP@kernel.org>
Date: Tue, 30 Dec 2025 18:26:28 +0200
From: Mike Rapoport <rppt@...nel.org>
To: Li Chen <me@...ux.beauty>
Cc: Alexander Graf <graf@...zon.com>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Pratyush Yadav <pratyush@...nel.org>, kexec@...ts.infradead.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] liveupdate/kho: Warn when kho_scratch is insufficient
for sparsemem
Hi,
On Tue, Dec 30, 2025 at 01:53:45PM +0800, Li Chen wrote:
> With KHO enabled, the successor kernel can temporarily run memblock in
> scratch-only mode during early boot. In that mode, SPARSEMEM may allocate
> a per-node scratch buffer via sparse_buffer_init(map_count *
> section_map_size()), which requires a single contiguous, aligned memblock
> allocation.
>
> If the maximum usable scratch range in a node is smaller than the
> estimated buffer size, kexec handover can hang very early in the
> successor kernel, and we may even have no chance to see the error on
> the console.
>
> Estimate the worst-case per-node requirement from the running kernel's
> sparsemem layout and compare it against the reserved scratch list by
> splitting scratch ranges per nid, sorting and merging them, and applying
> the section_map_size() alignment constraint. Warn once when scratch
> appears too small.
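
For reference, the per-node check described above (sort, merge, align, take the largest span) can be sketched in plain userspace C. This is only an illustration of the idea, not the code from the patch: struct range, max_aligned_span() and the half-open [start, end) convention are hypothetical names and assumptions, and align is assumed to be a power of two.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scratch range: [start, end) in bytes, owned by node nid. */
struct range {
	uint64_t start, end;
	int nid;
};

static int cmp_start(const void *a, const void *b)
{
	const struct range *x = a, *y = b;

	return (x->start > y->start) - (x->start < y->start);
}

/*
 * Return the largest contiguous span on node nid that begins at an
 * align-aligned address. Sorts r in place. align must be a power of two.
 */
static uint64_t max_aligned_span(struct range *r, size_t n, int nid,
				 uint64_t align)
{
	uint64_t best = 0, cur_start = 0, cur_end = 0;
	int have = 0;

	qsort(r, n, sizeof(*r), cmp_start);

	/* The extra iteration (i == n) flushes the last merged range. */
	for (size_t i = 0; i <= n; i++) {
		if (i < n && r[i].nid != nid)
			continue;

		/* Overlapping or touching ranges are merged into one. */
		if (i < n && have && r[i].start <= cur_end) {
			if (r[i].end > cur_end)
				cur_end = r[i].end;
			continue;
		}

		if (have) {
			/* Round the start up to the alignment boundary. */
			uint64_t s = (cur_start + align - 1) & ~(align - 1);

			if (s < cur_end && cur_end - s > best)
				best = cur_end - s;
		}

		if (i < n) {
			cur_start = r[i].start;
			cur_end = r[i].end;
			have = 1;
		}
	}

	return best;
}
```

A caller would compare the returned value against the estimated buffer size for that node and warn when it is smaller.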
>
> This check is a heuristic based on the running kernel's sparsemem layout
> and cannot account for all differences in a successor kernel. Keep it as
> a warning instead of rejecting kexec loads to avoid false positives
> causing unexpected regressions. Users can adjust kho_scratch accordingly
> before attempting a handover.
>
> To reduce boot-time overhead (particularly on large NUMA servers), run
> the check from a late initcall via system_long_wq instead of in
> kho_reserve_scratch().
>
> Signed-off-by: Li Chen <me@...ux.beauty>
> ---
> kernel/liveupdate/kexec_handover.c | 396 +++++++++++++++++++++++++++++
> 1 file changed, 396 insertions(+)
This is overkill for something where a pr_err() or a panic() would be
sufficient.
--
Sincerely yours,
Mike.