linux-kernel - Re: [PATCH] liveupdate/kho: Warn when kho_scratch is insufficient for sparsemem

Message-ID: <aVP9NHDn0WFtkMNP@kernel.org>
Date: Tue, 30 Dec 2025 18:26:28 +0200
From: Mike Rapoport <rppt@...nel.org>
To: Li Chen <me@...ux.beauty>
Cc: Alexander Graf <graf@...zon.com>,
	Pasha Tatashin <pasha.tatashin@...een.com>,
	Pratyush Yadav <pratyush@...nel.org>, kexec@...ts.infradead.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] liveupdate/kho: Warn when kho_scratch is insufficient
 for sparsemem

Hi,

On Tue, Dec 30, 2025 at 01:53:45PM +0800, Li Chen wrote:
> With KHO enabled, the successor kernel can temporarily run memblock in
> scratch-only mode during early boot. In that mode, SPARSEMEM may allocate
> a per-node scratch buffer via sparse_buffer_init(map_count *
> section_map_size()), which requires a single contiguous, aligned memblock
> allocation.
> 
> If the maximum usable scratch range in a node is smaller than the
> estimated buffer size, kexec handover can hang very early in the
> successor kernel, and we may even have no chance to see the error on
> the console.
> 
> Estimate the worst-case per-node requirement from the running kernel's
> sparsemem layout and compare it against the reserved scratch list by
> splitting scratch ranges per nid, sorting and merging them, and applying
> the section_map_size() alignment constraint. Warn once when scratch
> appears too small.
> 
> This check is a heuristic based on the running kernel's sparsemem layout
> and cannot account for all differences in a successor kernel. Keep it as
> a warning instead of rejecting kexec loads to avoid false positives
> causing unexpected regressions. Users can adjust kho_scratch accordingly
> before attempting a handover.
> 
> To reduce boot-time overhead (particularly on large NUMA servers), run
> the check from a late initcall via system_long_wq instead of in
> kho_reserve_scratch().
> 
> Signed-off-by: Li Chen <me@...ux.beauty>
> ---
>  kernel/liveupdate/kexec_handover.c | 396 +++++++++++++++++++++++++++++
>  1 file changed, 396 insertions(+)

This is overkill for something where a pr_err() or a panic() would be
sufficient.
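
For reference, the per-node check described in the quoted commit message
(sort the scratch ranges, merge them, apply the alignment constraint, then
compare against the estimated buffer size) boils down to roughly the
following. This is only a user-space sketch and not the code from the patch:
struct scratch_range, max_aligned_span() and scratch_is_sufficient() are
made-up names, the ranges are assumed to be already split per nid, and the
alignment is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one scratch range belonging to a single node. */
struct scratch_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

/* qsort() comparator: order ranges by start address. */
static int cmp_range(const void *a, const void *b)
{
	const struct scratch_range *ra = a, *rb = b;

	if (ra->start < rb->start)
		return -1;
	if (ra->start > rb->start)
		return 1;
	return 0;
}

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Sort the ranges, merge the ones that touch or overlap, and return the
 * largest number of bytes available in one merged range once its start
 * has been aligned up. Note that this reorders the array in place.
 */
static uint64_t max_aligned_span(struct scratch_range *r, size_t n,
				 uint64_t align)
{
	uint64_t best = 0;
	size_t i = 0;

	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), cmp_range);

	while (i < n) {
		uint64_t start = r[i].start;
		uint64_t end = r[i].end;
		uint64_t astart;

		/* Absorb every following range that starts inside this one. */
		for (i++; i < n && r[i].start <= end; i++) {
			if (r[i].end > end)
				end = r[i].end;
		}

		astart = align_up(start, align);
		if (astart < end && end - astart > best)
			best = end - astart;
	}

	return best;
}

/* Nonzero when one aligned contiguous span can hold 'need' bytes. */
static int scratch_is_sufficient(struct scratch_range *r, size_t n,
				 uint64_t align, uint64_t need)
{
	return max_aligned_span(r, n, align) >= need;
}
```

The caller would pass the estimated buffer size as 'need' and warn when
scratch_is_sufficient() returns zero.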

-- 
Sincerely yours,
Mike.