Message-ID: <aXILzST3NxyYXW1m@gmail.com>
Date: Thu, 22 Jan 2026 04:04:55 -0800
From: Breno Leitao <leitao@...ian.org>
To: Mike Rapoport <rppt@...nel.org>
Cc: Alexander Graf <graf@...zon.com>, 
	Pasha Tatashin <pasha.tatashin@...een.com>, Pratyush Yadav <pratyush@...nel.org>, 
	linux-kernel@...r.kernel.org, kexec@...ts.infradead.org, linux-mm@...ck.org, 
	usamaarif642@...il.com, rmikey@...a.com, clm@...com, riel@...riel.com, 
	kernel-team@...a.com, SeongJae Park <sj@...nel.org>
Subject: Re: [PATCH v4] kho: kexec-metadata: track previous kernel chain

Hello Mike,

On Thu, Jan 22, 2026 at 12:57:50PM +0200, Mike Rapoport wrote:
> > +/**
> > + * DOC: Kexec Metadata ABI
> > + *
> 
> It would be nice to link it from Documentation/ as well ;-)

Ack! I am planning something like:

	commit 90e098ca0d611b44594f08e50ba1cff3c932dd2b
	Author: Breno Leitao <leitao@...ian.org>
	Date:   Thu Jan 22 03:47:23 2026 -0800

	kho: document kexec-metadata tracking feature
	
	Add documentation for the kexec-metadata feature that tracks the
	previous kernel version and kexec boot count across kexec reboots.
	This helps diagnose bugs that only reproduce when kexecing from
	specific kernel versions.
	
	Suggested-by: Mike Rapoport <rppt@...nel.org>
	Signed-off-by: Breno Leitao <leitao@...ian.org>

	diff --git a/Documentation/admin-guide/mm/kho.rst b/Documentation/admin-guide/mm/kho.rst
	index 6dc18ed4b8861..1faf2c3ba4620 100644
	--- a/Documentation/admin-guide/mm/kho.rst
	+++ b/Documentation/admin-guide/mm/kho.rst
	@@ -113,3 +113,42 @@ stabilized.
	``/sys/kernel/debug/kho/in/sub_fdts/``
	Similar to ``kho/out/sub_fdts/``, but contains sub FDT blobs
	of KHO producers passed from the old kernel.
	+
	+Kexec Metadata
	+==============
	+
	+KHO automatically tracks metadata about the kexec chain, passing information
	+about the previous kernel to the next kernel. This feature helps diagnose
	+bugs that only reproduce when kexecing from specific kernel versions.
	+
	+On each KHO kexec, the kernel logs the previous kernel's version and the
	+number of kexec reboots since the last cold boot::
	+
	+    [    0.000000] KHO: exec from: 6.19.0-rc4-next-20260107 (count 1)
	+
	+The metadata includes:
	+
	+``previous_release``
	+    The kernel version string (from ``uname -r``) of the kernel that
	+    initiated the kexec.
	+
	+``kexec_count``
	+    The number of kexec boots since the last cold boot. On cold boot,
	+    this counter starts at 0 and increments with each kexec. This helps
	+    identify issues that only manifest after multiple consecutive kexec
	+    reboots.
	+
	+Use Cases
	+---------
	+
	+This metadata is particularly useful for debugging kexec transition bugs,
	+where a buggy kernel kexecs into a new kernel and the bug manifests only
	+in the second kernel. Examples of such bugs include:
	+
	+- Memory corruption from the previous kernel affecting the new kernel
	+- Incorrect hardware state left by the previous kernel
	+- Firmware/ACPI state issues that only appear in kexec scenarios
	+
	+At scale, correlating crashes to the previous kernel version enables
	+faster root cause analysis when issues only occur in specific kernel
	+transition scenarios.


> > diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
> 
> ...
> 
> >  static __init int kho_init(void)
> >  {
> >  	const void *fdt = kho_get_fdt();
> > @@ -1357,6 +1413,15 @@ static __init int kho_init(void)
> >  	if (err)
> >  		goto err_free_fdt;
> >  
> > +	if (fdt)
> > +		kho_process_kexec_metadata();
> 
> Can't we move it into the existing if (fdt) below?

Unfortunately, that won't work due to a data dependency between the two
functions.

kho_process_kexec_metadata() reads from the incoming FDT subtree and populates
kho_in. Basically:

	kho_in.kexec_count = metadata->kexec_count;

kho_populate_kexec_metadata() then derives the outgoing metadata->kexec_count
from that value:

	/* kho_in.kexec_count is set to 0 on cold boot */
	metadata->kexec_count = kho_in.kexec_count + 1;

If kho_process_kexec_metadata() were moved after kho_populate_kexec_metadata(),
kho_in.kexec_count would still be 0 when the outgoing count is computed, so the
next kernel would always see a count of 1, ignoring whatever was passed in the
FDT.
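
To make the ordering concrete, here is a minimal user-space sketch of the
dependency. It is not the kernel code: the struct names and the process(),
populate() and next_count() helpers are invented for illustration and only
model the two assignments quoted above.

```c
/* Hypothetical user-space model of the two steps; not the kernel code. */
struct metadata_model {
	unsigned int kexec_count;
};

struct kho_in_model {
	unsigned int kexec_count;
};

/* Models kho_process_kexec_metadata(): copy the incoming count into kho_in. */
static void process(struct kho_in_model *in,
		    const struct metadata_model *incoming)
{
	in->kexec_count = incoming->kexec_count;
}

/* Models kho_populate_kexec_metadata(): outgoing count is kho_in's plus one. */
static void populate(const struct kho_in_model *in,
		     struct metadata_model *outgoing)
{
	outgoing->kexec_count = in->kexec_count + 1;
}

/*
 * Returns the count that would be handed to the next kernel, given the
 * count received from the previous one and the order of the two calls.
 */
static unsigned int next_count(unsigned int incoming_count, int process_first)
{
	struct kho_in_model in = { 0 };	/* zero, as on cold boot */
	struct metadata_model incoming = { incoming_count };
	struct metadata_model outgoing = { 0 };

	if (process_first) {
		process(&in, &incoming);
		populate(&in, &outgoing);
	} else {
		populate(&in, &outgoing);
		process(&in, &incoming);
	}

	return outgoing.kexec_count;
}
```

With an incoming count of 3, next_count(3, 1) gives 4, while next_count(3, 0)
gives 1: in the second ordering the outgoing count is computed before kho_in
has been filled in.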

Restructuring to call kho_in_debugfs_init() earlier also doesn't work:

	if (fdt) {
		kho_in_debugfs_init(&kho_in.dbg, fdt);
		kho_process_kexec_metadata();
		return 0;
	}

	/* Populate kexec metadata for the possible next kexec */
	err = kho_populate_kexec_metadata();
	if (err)
		pr_warn("failed to initialize kexec-metadata subtree: %d\n",
			err);

This would return early without populating the kexec metadata for the next
kexec, breaking the chain on KHO boots.
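
As a rough illustration of the control flow only, the two variants can be
modelled in user space as below. The boot_result struct and both function
names are made up for this sketch; the real kho_init() does much more.

```c
#include <stdbool.h>

/* Hypothetical record of which metadata steps ran during one boot. */
struct boot_result {
	bool processed;	/* incoming metadata was read */
	bool populated;	/* outgoing metadata was prepared */
};

/* Models the restructured variant that returns early on a KHO boot. */
static struct boot_result init_early_return(bool have_fdt)
{
	struct boot_result res = { false, false };

	if (have_fdt) {
		res.processed = true;
		return res;	/* populate step is never reached */
	}

	res.populated = true;
	return res;
}

/* Models the ordering in the patch: process if there is an FDT, then populate. */
static struct boot_result init_patch_order(bool have_fdt)
{
	struct boot_result res = { false, false };

	if (have_fdt)
		res.processed = true;

	res.populated = true;
	return res;
}
```

On a KHO boot (have_fdt true), init_early_return() leaves populated false, so
nothing is prepared for the next kexec, whereas init_patch_order() runs both
steps. On a cold boot both variants only populate.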

Please let me know if I am missing any other option.

> > +
> > +	/* Populate kexec metadata for the possible next kexec */
> > +	err = kho_populate_kexec_metadata();
> > +	if (err)
> > +		pr_warn("failed to initialize kexec-metadata subtree: %d\n",
> > +			err);
> 
> Please follow if (err) goto err_ pattern.
> 
> kho_populate_kexec_metadata() failure essentially means that we failed to
> allocate memory. This shouldn't happen that early in boot, but if it did,
> then something is utterly wrong.

Ack!

Thanks for the review,
--breno
