Message-ID: <CA+CK2bDoq=SxX_YV2S+YpXRi_a0eWOH+HC7u4NO9F-+YcPD5ew@mail.gmail.com>
Date: Sun, 8 Jun 2025 09:13:49 -0400
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: pratyush@...nel.org, jasonmiu@...gle.com, graf@...zon.com, 
	changyuanl@...gle.com, dmatlack@...gle.com, rientjes@...gle.com, 
	corbet@....net, rdunlap@...radead.org, ilpo.jarvinen@...ux.intel.com, 
	kanie@...ux.alibaba.com, ojeda@...nel.org, aliceryhl@...gle.com, 
	masahiroy@...nel.org, akpm@...ux-foundation.org, tj@...nel.org, 
	yoann.congal@...le.fr, mmaurer@...gle.com, roman.gushchin@...ux.dev, 
	chenridong@...wei.com, axboe@...nel.dk, mark.rutland@....com, 
	jannh@...gle.com, vincent.guittot@...aro.org, hannes@...xchg.org, 
	dan.j.williams@...el.com, david@...hat.com, joel.granados@...nel.org, 
	rostedt@...dmis.org, anna.schumaker@...cle.com, song@...nel.org, 
	zhangguopeng@...inos.cn, linux@...ssschuh.net, linux-kernel@...r.kernel.org, 
	linux-doc@...r.kernel.org, linux-mm@...ck.org, gregkh@...uxfoundation.org, 
	tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, 
	dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com, rafael@...nel.org, 
	dakr@...nel.org, bartosz.golaszewski@...aro.org, cw00.choi@...sung.com, 
	myungjoo.ham@...sung.com, yesanishhere@...il.com, Jonathan.Cameron@...wei.com, 
	quic_zijuhu@...cinc.com, aleksander.lobakin@...el.com, ira.weiny@...el.com, 
	andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de, 
	bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com, 
	stuart.w.hayes@...il.com, ptyadav@...zon.de
Subject: Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs

On Mon, May 26, 2025 at 3:55 AM Mike Rapoport <rppt@...nel.org> wrote:
>
> On Thu, May 15, 2025 at 06:23:12PM +0000, Pasha Tatashin wrote:
> > Introduce the framework within LUO to support preserving specific types
> > of file descriptors across a live update transition. This allows
> > stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in
> > the new kernel.
> >
> > Note: The core logic for iterating through the luo_files_list and
> > invoking the handler callbacks (prepare, freeze, cancel, finish)
> > within luo_do_files_*_calls, as well as managing the u64 data
> > persistence via the FDT for individual files, is currently implemented
> > as stubs in this patch. This patch sets up the registration, FDT layout,
> > and retrieval framework.
> >
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@...een.com>
> > ---
> >  drivers/misc/liveupdate/Makefile       |   1 +
> >  drivers/misc/liveupdate/luo_core.c     |  19 +
> >  drivers/misc/liveupdate/luo_files.c    | 563 +++++++++++++++++++++++++
> >  drivers/misc/liveupdate/luo_internal.h |  11 +
> >  include/linux/liveupdate.h             |  62 +++
> >  5 files changed, 656 insertions(+)
> >  create mode 100644 drivers/misc/liveupdate/luo_files.c
> >
> > diff --git a/drivers/misc/liveupdate/Makefile b/drivers/misc/liveupdate/Makefile
> > index df1c9709ba4f..b4cdd162574f 100644
> > --- a/drivers/misc/liveupdate/Makefile
> > +++ b/drivers/misc/liveupdate/Makefile
> > @@ -1,3 +1,4 @@
> >  # SPDX-License-Identifier: GPL-2.0
> >  obj-y                                        += luo_core.o
> > +obj-y                                        += luo_files.o
> >  obj-y                                        += luo_subsystems.o
> > diff --git a/drivers/misc/liveupdate/luo_core.c b/drivers/misc/liveupdate/luo_core.c
> > index 417e7f6bf36c..ab1d76221fe2 100644
> > --- a/drivers/misc/liveupdate/luo_core.c
> > +++ b/drivers/misc/liveupdate/luo_core.c
> > @@ -110,6 +110,10 @@ static int luo_fdt_setup(struct kho_serialization *ser)
> >       if (ret)
> >               goto exit_free;
> >
> > +     ret = luo_files_fdt_setup(fdt_out);
> > +     if (ret)
> > +             goto exit_free;
> > +
> >       ret = luo_subsystems_fdt_setup(fdt_out);
> >       if (ret)
> >               goto exit_free;
>
> The duplication of files and subsystems does not look nice here and below.
> Can't we make files to be a subsystem?

Good idea, let me work on this.
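
As a rough user-space illustration of the direction suggested above: if
files and subsystems shared one callback shape, the paired calls in
luo_do_prepare_calls() could collapse into a single loop that cancels, in
reverse order, whatever was already prepared. Every name in this sketch
(struct luo_participant, luo_prepare_all(), the mock callbacks) is
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical: files and subsystems expressed with one callback shape. */
struct luo_participant {
	const char *name;
	int (*prepare)(void);
	void (*cancel)(void);
};

/*
 * Call prepare() on each participant in order. If one fails, cancel the
 * participants that already succeeded, in reverse order, and return the
 * error from the failing prepare().
 */
static int luo_prepare_all(const struct luo_participant *p, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = p[i].prepare();
		if (ret) {
			while (i--)
				p[i].cancel();
			return ret;
		}
	}
	return 0;
}

/* Mock callbacks: files prepares fine, subsystems fails. */
static int files_prepared;
static int subsystems_cancelled;

static int files_prepare(void) { files_prepared = 1; return 0; }
static void files_cancel(void) { files_prepared = 0; }
static int subsystems_prepare_fail(void) { return -ENOMEM; }
static void subsystems_cancel(void) { subsystems_cancelled = 1; }

static int demo_prepare_failure(void)
{
	const struct luo_participant parts[] = {
		{ "files", files_prepare, files_cancel },
		{ "subsystems", subsystems_prepare_fail, subsystems_cancel },
	};
	int ret = luo_prepare_all(parts, sizeof(parts) / sizeof(parts[0]));

	/* files was prepared, then rolled back once subsystems failed. */
	assert(files_prepared == 0);
	/* The failing participant itself is never cancelled. */
	assert(subsystems_cancelled == 0);
	return ret;
}
```

The sketch only cancels participants whose prepare() succeeded, which
mirrors the luo_do_files_cancel_calls() call on the error path in the
quoted hunk.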

>
> > @@ -145,7 +149,13 @@ static int luo_do_prepare_calls(void)
> >  {
> >       int ret;
> >
> > +     ret = luo_do_files_prepare_calls();
> > +     if (ret)
> > +             return ret;
> > +
> >       ret = luo_do_subsystems_prepare_calls();
> > +     if (ret)
> > +             luo_do_files_cancel_calls();
> >
> >       return ret;
> >  }
> > @@ -154,18 +164,26 @@ static int luo_do_freeze_calls(void)
> >  {
> >       int ret;
> >
> > +     ret = luo_do_files_freeze_calls();
> > +     if (ret)
> > +             return ret;
> > +
> >       ret = luo_do_subsystems_freeze_calls();
> > +     if (ret)
> > +             luo_do_files_cancel_calls();
> >
> >       return ret;
> >  }
> >
> >  static void luo_do_finish_calls(void)
> >  {
> > +     luo_do_files_finish_calls();
> >       luo_do_subsystems_finish_calls();
> >  }
> >
> >  static void luo_do_cancel_calls(void)
> >  {
> > +     luo_do_files_cancel_calls();
> >       luo_do_subsystems_cancel_calls();
> >  }
> >
> > @@ -436,6 +454,7 @@ static int __init luo_startup(void)
> >       }
> >
> >       __luo_set_state(LIVEUPDATE_STATE_UPDATED);
> > +     luo_files_startup(luo_fdt_in);
> >       luo_subsystems_startup(luo_fdt_in);
> >
> >       return 0;
> > diff --git a/drivers/misc/liveupdate/luo_files.c b/drivers/misc/liveupdate/luo_files.c
> > new file mode 100644
> > index 000000000000..953fc40db3d7
> > --- /dev/null
> > +++ b/drivers/misc/liveupdate/luo_files.c
> > @@ -0,0 +1,563 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +/*
> > + * Copyright (c) 2025, Google LLC.
> > + * Pasha Tatashin <pasha.tatashin@...een.com>
> > + */
> > +
> > +/**
> > + * DOC: LUO file descriptors
> > + *
> > + * LUO provides the infrastructure necessary to preserve
> > + * specific types of stateful file descriptors across a kernel live
> > + * update transition. The primary goal is to allow workloads, such as virtual
> > + * machines using vfio, memfd, or iommufd to retain access to their essential
> > + * resources without interruption after the underlying kernel is  updated.
> > + *
> > + * The framework operates based on handler registration and instance tracking:
> > + *
> > + * 1. Handler Registration: Kernel modules responsible for specific file
> > + * types (e.g., memfd, vfio) register a &struct liveupdate_filesystem
> > + * handler. This handler contains callbacks (&liveupdate_filesystem.prepare,
> > + * &liveupdate_filesystem.freeze, &liveupdate_filesystem.finish, etc.)
> > + * and a unique 'compatible' string identifying the file type.
> > + * Registration occurs via liveupdate_register_filesystem().
>
> I wouldn't use filesystem here, as the obvious users are not really
> filesystems. Maybe liveupdate_register_file_ops?

This matches how these structs are named in Linux, so I think the name
is OK.

>
> > + *
> > + * 2. File Instance Tracking: When a potentially preservable file needs to be
> > + * managed for live update, the core LUO logic (luo_register_file()) finds a
> > + * compatible registered handler using its &liveupdate_filesystem.can_preserve
> > + * callback. If found,  an internal &struct luo_file instance is created,
> > + * assigned a unique u64 'token', and added to a list.
> > + *
> > + * 3. State Persistence (FDT): During the LUO prepare/freeze phases, the
> > + * registered handler callbacks are invoked for each tracked file instance.
> > + * These callbacks can generate a u64 data payload representing the minimal
> > + * state needed for restoration. This payload, along with the handler's
> > + * compatible string and the unique token, is stored in a dedicated
> > + * '/file-descriptors' node within the main LUO FDT blob passed via
> > + * Kexec Handover (KHO).
> > + *
> > + * 4. Restoration: In the new kernel, the LUO framework parses the incoming
> > + * FDT to reconstruct the list of &struct luo_file instances. When the
> > + * original owner requests the file, luo_retrieve_file() uses the corresponding
> > + * handler's &liveupdate_filesystem.retrieve callback, passing the persisted
> > + * u64 data, to recreate or find the appropriate &struct file object.
> > + */
>
> The DOC is mostly about what luo_files does, we'd also need a description
> of its intended use, both internally in the kernel and by the userspace.
>
> > +
> > +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> > +
>
> ...
>
> > +/**
> > + * luo_register_file - Register a file descriptor for live update management.
> > + * @tokenp: Return argument for the token value.
> > + * @file: Pointer to the struct file to be preserved.
> > + *
> > + * Context: Must be called when LUO is in 'normal' state.
> > + *
> > + * Return: 0 on success. Negative errno on failure.
> > + */
> > +int luo_register_file(u64 *tokenp, struct file *file)
> > +{
> > +     struct liveupdate_filesystem *fs;
> > +     bool found = false;
> > +     int ret = -ENOENT;
> > +     u64 token;
> > +
> > +     luo_state_read_enter();
> > +     if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
> > +             pr_warn("File can be registered only in normal or prepared state\n");
> > +             luo_state_read_exit();
> > +             return -EBUSY;
> > +     }
> > +
> > +     down_read(&luo_filesystems_list_rwsem);
> > +     list_for_each_entry(fs, &luo_filesystems_list, list) {
> > +             if (fs->can_preserve(file, fs->arg)) {
> > +                     found = true;
> > +                     break;
> > +             }
> > +     }
> > +
> > +     if (found) {
>
>         if (!found)
>                 goto exit_unlock;

Done, thank you.
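
For reference, a self-contained user-space mock of the early-exit shape
suggested above. It is only a sketch: the handler table, register_file(),
and the lock_depth counter standing in for the rwsem are assumptions made
for illustration, not the code in luo_files.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct file { int kind; };	/* stand-in for the kernel's struct file */

/* Hypothetical, trimmed-down handler with only the fields used here. */
struct handler {
	bool (*can_preserve)(const struct file *file);
	const char *compatible;
};

static bool is_memfd(const struct file *f) { return f->kind == 1; }
static bool is_vfio(const struct file *f) { return f->kind == 2; }

static const struct handler handlers[] = {
	{ is_memfd, "memfd-v1" },
	{ is_vfio, "vfiofd-v1" },
};

static int lock_depth;	/* mock lock: must be 0 again on every return */

static int register_file(const struct file *file, const char **compatible)
{
	const struct handler *h = NULL;
	int ret = -ENOENT;
	size_t i;

	lock_depth++;	/* down_read() in the real code */
	for (i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++) {
		if (handlers[i].can_preserve(file)) {
			h = &handlers[i];
			break;
		}
	}

	/* Bail out early so the success path is not nested in an if block. */
	if (!h)
		goto exit_unlock;

	*compatible = h->compatible;
	ret = 0;

exit_unlock:
	lock_depth--;	/* up_read() in the real code */
	return ret;
}

static void demo_register(void)
{
	struct file memfd = { .kind = 1 }, unknown = { .kind = 7 };
	const char *compatible = NULL;

	assert(register_file(&memfd, &compatible) == 0);
	assert(strcmp(compatible, "memfd-v1") == 0);
	assert(register_file(&unknown, &compatible) == -ENOENT);
	assert(lock_depth == 0);
}
```

Both the success and the -ENOENT path leave through exit_unlock, so the
lock is released exactly once either way.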


> > + * struct liveupdate_filesystem - Represents a handler for a live-updatable
> > + * filesystem/file type.
> > + * @prepare:       Optional. Saves state for a specific file instance (@file,
> > + *                 @arg) before update, potentially returning value via @data.
> > + *                 Returns 0 on success, negative errno on failure.
> > + * @freeze:        Optional. Performs final actions just before kernel
> > + *                 transition, potentially reading/updating the handle via
> > + *                 @data.
> > + *                 Returns 0 on success, negative errno on failure.
> > + * @cancel:        Optional. Cleans up state/resources if update is aborted
> > + *                 after prepare/freeze succeeded, using the @data handle (by
> > + *                 value) from the successful prepare. Returns void.
> > + * @finish:        Optional. Performs final cleanup in the new kernel using the
> > + *                 preserved @data handle (by value). Returns void.
> > + * @retrieve:      Retrieve the preserved file. Must be called before finish.
> > + * @can_preserve:  callback to determine if @file with associated context (@arg)
> > + *                 can be preserved by this handler.
> > + *                 Return bool (true if preservable, false otherwise).
> > + * @compatible:    The compatibility string (e.g., "memfd-v1", "vfiofd-v1")
> > + *                 that uniquely identifies the filesystem or file type this
> > + *                 handler supports. This is matched against the compatible
> > + *                 string associated with individual &struct liveupdate_file
> > + *                 instances.
> > + * @arg:           An opaque pointer to implementation-specific context data
> > + *                 associated with this filesystem handler registration.
> > + * @list:          used for linking this handler instance into a global list of
> > + *                 registered filesystem handlers.
> > + *
> > + * Modules that want to support live update for specific file types should
> > + * register an instance of this structure. LUO uses this registration to
> > + * determine if a given file can be preserved and to find the appropriate
> > + * operations to manage its state across the update.
> > + */
> > +struct liveupdate_filesystem {
> > +     int (*prepare)(struct file *file, void *arg, u64 *data);
> > +     int (*freeze)(struct file *file, void *arg, u64 *data);
> > +     void (*cancel)(struct file *file, void *arg, u64 data);
> > +     void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
> > +     int (*retrieve)(void *arg, u64 data, struct file **file);
> > +     bool (*can_preserve)(struct file *file, void *arg);
> > +     const char *compatible;
> > +     void *arg;
> > +     struct list_head list;
> > +};
> > +
>
> Like with subsystems, I'd split ops and make the data part private to
> luo_files.c

For simplicity, I would like to keep them together, the same as in subsystems.
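
To make the callback contract quoted above easier to follow, here is a
small user-space mock of the u64 round trip: prepare() condenses a file's
state into one u64 in the old kernel, and retrieve() rebuilds a file from
that value in the new one. The struct fs_ops type, the backing_id field,
and the mock callbacks are assumptions for illustration only; the
prepare/retrieve signatures follow the quoted struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

struct file { u64 backing_id; };	/* stand-in for the kernel's struct file */

/* Hypothetical subset of the handler: two callbacks plus the data part. */
struct fs_ops {
	int (*prepare)(struct file *file, void *arg, u64 *data);
	int (*retrieve)(void *arg, u64 data, struct file **file);
	const char *compatible;
	void *arg;
};

static struct file restored;	/* the object recreated "in the new kernel" */

static int mock_prepare(struct file *file, void *arg, u64 *data)
{
	(void)arg;
	*data = file->backing_id;	/* minimal state needed to restore */
	return 0;
}

static int mock_retrieve(void *arg, u64 data, struct file **file)
{
	(void)arg;
	restored.backing_id = data;
	*file = &restored;
	return 0;
}

static struct fs_ops mock_ops = {
	.prepare = mock_prepare,
	.retrieve = mock_retrieve,
	.compatible = "mock-v1",
	.arg = NULL,
};

/* Returns the id seen after the round trip, or 0 if a callback failed. */
static u64 round_trip(struct fs_ops *ops, u64 id)
{
	struct file old = { .backing_id = id };
	struct file *new_file = NULL;
	u64 data = 0;

	/* Old kernel side: save state into the u64 payload. */
	if (ops->prepare(&old, ops->arg, &data))
		return 0;

	/* New kernel side: recreate the file from the preserved payload. */
	if (ops->retrieve(ops->arg, data, &new_file))
		return 0;

	assert(new_file != NULL);
	return new_file->backing_id;
}
```

In the real framework the payload would travel through the LUO FDT between
the two steps; the mock passes it directly.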


>
> >  /**
> >   * struct liveupdate_subsystem - Represents a subsystem participating in LUO
> >   * @prepare:      Optional. Called during LUO prepare phase. Should perform
> > @@ -142,6 +191,9 @@ int liveupdate_register_subsystem(struct liveupdate_subsystem *h);
> >  int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h);
> >  int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data);
> >
> > +int liveupdate_register_filesystem(struct liveupdate_filesystem *h);
> > +int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h);
>
> int liveupdate_register_file_ops(name, ops, data, ret_token) ?
>
> > +
> >  #else /* CONFIG_LIVEUPDATE */
> >
> >  static inline int liveupdate_reboot(void)
> > @@ -180,5 +232,15 @@ static inline int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h,
> >       return -ENODATA;
> >  }
> >
> > +static inline int liveupdate_register_filesystem(struct liveupdate_filesystem *h)
> > +{
> > +     return 0;
> > +}
> > +
> > +static inline int liveupdate_unregister_filesystem(struct liveupdate_filesystem *h)
> > +{
> > +     return 0;
> > +}
> > +
> >  #endif /* CONFIG_LIVEUPDATE */
> >  #endif /* _LINUX_LIVEUPDATE_H */
> > --
> > 2.49.0.1101.gccaa498523-goog
> >
> >
>
> --
> Sincerely yours,
> Mike.
