Message-ID: <20250729172812.GP36037@nvidia.com>
Date: Tue, 29 Jul 2025 14:28:12 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: pratyush@...nel.org, jasonmiu@...gle.com, graf@...zon.com,
	changyuanl@...gle.com, rppt@...nel.org, dmatlack@...gle.com,
	rientjes@...gle.com, corbet@....net, rdunlap@...radead.org,
	ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com,
	ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org,
	akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr,
	mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com,
	axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com,
	vincent.guittot@...aro.org, hannes@...xchg.org,
	dan.j.williams@...el.com, david@...hat.com,
	joel.granados@...nel.org, rostedt@...dmis.org,
	anna.schumaker@...cle.com, song@...nel.org, zhangguopeng@...inos.cn,
	linux@...ssschuh.net, linux-kernel@...r.kernel.org,
	linux-doc@...r.kernel.org, linux-mm@...ck.org,
	gregkh@...uxfoundation.org, tglx@...utronix.de, mingo@...hat.com,
	bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org,
	hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
	bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
	myungjoo.ham@...sung.com, yesanishhere@...il.com,
	Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
	aleksander.lobakin@...el.com, ira.weiny@...el.com,
	andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
	bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
	stuart.w.hayes@...il.com, ptyadav@...zon.de, lennart@...ttering.net,
	brauner@...nel.org, linux-api@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, saeedm@...dia.com,
	ajayachandra@...dia.com, parav@...dia.com, leonro@...dia.com,
	witu@...dia.com
Subject: Re: [PATCH v2 10/32] liveupdate: luo_core: Live Update Orchestrator

On Wed, Jul 23, 2025 at 02:46:23PM +0000, Pasha Tatashin wrote:
> Introduce LUO, a mechanism intended to facilitate kernel updates while
> keeping designated devices operational across the transition (e.g., via
> kexec). The primary use case is updating hypervisors with minimal
> disruption to running virtual machines. For userspace side of hypervisor
> update we have copyless migration. LUO is for updating the kernel.
> 
> This initial patch lays the groundwork for the LUO subsystem.
> 
> Further functionality, including the implementation of state transition
> logic, integration with KHO, and hooks for subsystems and file
> descriptors, will be added in subsequent patches.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@...een.com>
> ---
>  include/linux/liveupdate.h       | 140 ++++++++++++++
>  kernel/liveupdate/Kconfig        |  27 +++
>  kernel/liveupdate/Makefile       |   1 +
>  kernel/liveupdate/luo_core.c     | 301 +++++++++++++++++++++++++++++++
>  kernel/liveupdate/luo_internal.h |  21 +++
>  5 files changed, 490 insertions(+)
>  create mode 100644 include/linux/liveupdate.h
>  create mode 100644 kernel/liveupdate/luo_core.c
>  create mode 100644 kernel/liveupdate/luo_internal.h
> 
> diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
> new file mode 100644
> index 000000000000..da8f05c81e51
> --- /dev/null
> +++ b/include/linux/liveupdate.h
> @@ -0,0 +1,140 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * Copyright (c) 2025, Google LLC.
> + * Pasha Tatashin <pasha.tatashin@...een.com>
> + */
> +#ifndef _LINUX_LIVEUPDATE_H
> +#define _LINUX_LIVEUPDATE_H
> +
> +#include <linux/bug.h>
> +#include <linux/types.h>
> +#include <linux/list.h>
> +
> +/**
> + * enum liveupdate_event - Events that trigger live update callbacks.
> + * @LIVEUPDATE_PREPARE: PREPARE should happen *before* the blackout window.
> + *                      Subsystems should prepare for an upcoming reboot by
> + *                      serializing their states. However, it must be considered
> + *                      that user applications, e.g. virtual machines are still
> + *                      running during this phase.
> + * @LIVEUPDATE_FREEZE:  FREEZE sent from the reboot() syscall, when the current
> + *                      kernel is on its way out. This is the final opportunity
> + *                      for subsystems to save any state that must persist
> + *                      across the reboot. Callbacks for this event should be as
> + *                      fast as possible since they are on the critical path of
> + *                      rebooting into the next kernel.
> + * @LIVEUPDATE_FINISH:  FINISH is sent in the newly booted kernel after a
> + *                      successful live update and normally *after* the blackout
> + *                      window. Subsystems should perform any final cleanup
> + *                      during this phase. This phase also provides an
> + *                      opportunity to clean up devices that were preserved but
> + *                      never explicitly reclaimed during the live update
> + *                      process. State restoration should have already occurred
> + *                      before this event. Callbacks for this event must not
> + *                      fail. The completion of this call transitions the
> + *                      machine from ``updated`` to ``normal`` state.
> + * @LIVEUPDATE_CANCEL:  CANCEL the live update and go back to normal state. This
> + *                      event is user initiated, or is done automatically when
> + *                      LIVEUPDATE_PREPARE or LIVEUPDATE_FREEZE stage fails.
> + *                      Subsystems should revert any actions taken during the
> + *                      corresponding prepare event. Callbacks for this event
> + *                      must not fail.
> + *
> + * These events represent the different stages and actions within the live
> + * update process that subsystems (like device drivers and bus drivers)
> + * need to be aware of to correctly serialize and restore their state.
> + *
> + */
> +enum liveupdate_event {
> +	LIVEUPDATE_PREPARE,
> +	LIVEUPDATE_FREEZE,
> +	LIVEUPDATE_FINISH,
> +	LIVEUPDATE_CANCEL,
> +};

I saw that a later patch moves these hunks; that is poor patch planning.

Ideally an ioctl subsystem should start out with the first patch
introducing the basic cdev, file open, ioctl dispatch, ioctl uapi
header and related simple infrastructure.

Then you'd go basically ioctl by ioctl adding the new ioctls and
explaining what they do in the patch commit messages.

> +/**
> + * liveupdate_state_updated - Check if the system is in the live update
> + * 'updated' state.
> + *
> + * This function checks if the live update orchestrator is in the
> + * ``LIVEUPDATE_STATE_UPDATED`` state. This state indicates that the system has
> + * successfully rebooted into a new kernel as part of a live update, and the
> + * preserved devices are expected to be in the process of being reclaimed.
> + *
> + * This is typically used by subsystems during early boot of the new kernel
> + * to determine if they need to attempt to restore state from a previous
> + * live update.
> + *
> + * @return true if the system is in the ``LIVEUPDATE_STATE_UPDATED`` state,
> + * false otherwise.
> + */
> +bool liveupdate_state_updated(void)
> +{
> +	return is_current_luo_state(LIVEUPDATE_STATE_UPDATED);
> +}
> +EXPORT_SYMBOL_GPL(liveupdate_state_updated);

Unless there are existing in-tree users, there should not be exports.

I'm also not really sure why there is global state, I would expect the
fd and session objects to record what kind of things they are, not
having weird globals.

Like liveupdate_register_subsystem() stuff, it already has a lock,
&luo_subsystem_list_mutex, if you want to block mutation of the list
then, IMHO, it makes more sense to stick a specific variable
'luo_subsystems_list_immutable' under that lock and make it very
obvious.
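
To make that concrete, here is a rough userspace model of the idea, with
a pthread mutex standing in for the kernel mutex. Only
luo_subsystem_list_mutex, luo_subsystems_list_immutable and
liveupdate_register_subsystem() come from the discussion above; the
struct layout, the list handling, the -EBUSY return and the
luo_subsystems_freeze_list() helper are assumptions made up for the
sketch, not what the series does.

```c
/* Userspace model only; the kernel version would use struct mutex. */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the real subsystem descriptor. */
struct liveupdate_subsystem {
	const char *name;
	struct liveupdate_subsystem *next;
};

static pthread_mutex_t luo_subsystem_list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Both variables below are protected by luo_subsystem_list_mutex. */
static struct liveupdate_subsystem *luo_subsystem_list;
static bool luo_subsystems_list_immutable;

/* Hypothetical helper: call once the list must stop changing. */
void luo_subsystems_freeze_list(void)
{
	pthread_mutex_lock(&luo_subsystem_list_mutex);
	luo_subsystems_list_immutable = true;
	pthread_mutex_unlock(&luo_subsystem_list_mutex);
}

int liveupdate_register_subsystem(struct liveupdate_subsystem *h)
{
	int ret = 0;

	pthread_mutex_lock(&luo_subsystem_list_mutex);
	if (luo_subsystems_list_immutable) {
		/* The reason for refusing is visible right here. */
		ret = -EBUSY;
	} else {
		h->next = luo_subsystem_list;
		luo_subsystem_list = h;
	}
	pthread_mutex_unlock(&luo_subsystem_list_mutex);

	return ret;
}
```

The point is that the check and the list mutation sit under the same
lock, so there is no need to consult a global state machine to know
whether registration is allowed.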

Stuff like luo_files_startup() feels clunky to me:

+       ret = liveupdate_register_subsystem(&luo_file_subsys);
+       if (ret) {
+               pr_warn("Failed to register luo_file subsystem [%d]\n", ret);
+               return ret;
+       }
+
+       if (liveupdate_state_updated()) {

That's going to be a standard pattern - I would expect that
liveupdate_register_subsystem() would do the check for updated and
then arrange to call back something like
liveupdate_subsystem.ops.post_update()

And then post_update() would get the info that is currently under
liveupdate_get_subsystem_data() as arguments instead of having to make
more function calls.

Maybe even the fdt_node_check_compatible() can be hoisted.

That would remove a bunch more liveupdate_state_updated() calls.
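
A rough userspace model of that shape, to show what I mean. The names
liveupdate_register_subsystem(), post_update() and luo_files_startup()
are from the discussion above; the ops struct layout, the uint64_t data
argument, and the luo_model_* pieces are assumptions invented for the
sketch and are not taken from the series.

```c
/* Userspace model only; signatures are assumptions, not the real API. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct liveupdate_subsystem;

/* Hypothetical ops table; only post_update() is named in this review. */
struct liveupdate_subsystem_ops {
	/* Called by the core when this kernel was booted via live update. */
	int (*post_update)(struct liveupdate_subsystem *h, uint64_t data);
};

struct liveupdate_subsystem {
	const char *name;
	const struct liveupdate_subsystem_ops *ops;
};

/* Stand-ins for core internals so the model can be exercised. */
static bool luo_model_updated;
static uint64_t luo_model_preserved_data;

void luo_model_set_updated(bool updated, uint64_t data)
{
	luo_model_updated = updated;
	luo_model_preserved_data = data;
}

int liveupdate_register_subsystem(struct liveupdate_subsystem *h)
{
	/* ... the real code would add h to the subsystem list here ... */

	/* The core does the "updated" check once, for every subsystem. */
	if (!luo_model_updated || !h->ops || !h->ops->post_update)
		return 0;

	/* Preserved data arrives as an argument, no extra lookup call. */
	return h->ops->post_update(h, luo_model_preserved_data);
}

/* Example subsystem: it no longer asks the core about global state. */
static uint64_t luo_file_restored_data;

static int luo_file_post_update(struct liveupdate_subsystem *h, uint64_t data)
{
	(void)h;
	luo_file_restored_data = data;
	return 0;
}

static const struct liveupdate_subsystem_ops luo_file_ops = {
	.post_update = luo_file_post_update,
};

static struct liveupdate_subsystem luo_file_subsys = {
	.name = "luo_file",
	.ops = &luo_file_ops,
};

int luo_files_startup(void)
{
	return liveupdate_register_subsystem(&luo_file_subsys);
}
```

With this arrangement luo_files_startup() shrinks to a single
registration call, and the restore path only runs when the core decides
it should.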

etc.

Jason
