Message-ID: <20250331163734.GA266513@nvidia.com>
Date: Mon, 31 Mar 2025 13:37:34 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: changyuanl@...gle.com, graf@...zon.com, rppt@...nel.org,
rientjes@...gle.com, corbet@....net, rdunlap@...radead.org,
ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com,
ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org,
akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr,
mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com,
axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com,
vincent.guittot@...aro.org, hannes@...xchg.org,
dan.j.williams@...el.com, david@...hat.com,
joel.granados@...nel.org, rostedt@...dmis.org,
anna.schumaker@...cle.com, song@...nel.org, zhangguopeng@...inos.cn,
linux@...ssschuh.net, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
gregkh@...uxfoundation.org, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org,
hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
myungjoo.ham@...sung.com, yesanishhere@...il.com,
Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
aleksander.lobakin@...el.com, ira.weiny@...el.com,
andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
stuart.w.hayes@...il.com, jgowans@...zon.com,
Pratyush Yadav <ptyadav@...zon.de>
Subject: Re: [RFC v1 1/3] luo: Live Update Orchestrator
On Thu, Mar 27, 2025 at 03:29:18PM -0400, Pasha Tatashin wrote:
> Here’s a summary of the planned approach:
>
> 1. Unified Location: LUO will be moved under misc/liveupdate/ to house
> the consolidated functionality.
It makes sense to me, and I'd prefer all this live update stuff be as
isolated and "side car" as possible to keep the normal kernel flow
simple.
> 2. User Interfaces: A primary character device (/dev/liveupdate)
> utilizing an ioctl interface for control operations. (An initial draft
> of this interface is available here:
> https://raw.githubusercontent.com/soleen/linux/refs/heads/luo/rfc-v2.1/include/uapi/linux/liveupdate.h)
That looks like a pretty comprehensive view.
I'd probably nitpick some things, but nothing fundamental.
You *may* want to look at drivers/fwctl/main.c around fwctl_fops_ioctl
for some thoughts on how to structure an ioctl implementation to be
safely extensible. You can even just copy that stuff; I already copied
it from iommufd.
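
As a rough illustration of the extensible pattern referred to above:
each command struct starts with its own size, new fields are only ever
appended, and the kernel copies the payload with
copy_struct_from_user() semantics. The sketch below is a userspace
stand-in under those assumptions, not the fwctl or iommufd code;
struct luo_cmd_v1, struct luo_cmd_v2, copy_struct_sketch() and
luo_cmd_sketch() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical first released layout: the struct leads with its size. */
struct luo_cmd_v1 {
	uint32_t size;
	uint32_t flags;
};

/* Hypothetical later revision: fields are appended, never reordered. */
struct luo_cmd_v2 {
	uint32_t size;
	uint32_t flags;
	uint64_t token;
};

/*
 * Userspace stand-in for copy_struct_from_user(): copy the bytes both
 * sides know about, zero-fill what older userspace did not send, and
 * fail with -E2BIG if newer userspace set bytes we do not understand.
 */
static int copy_struct_sketch(void *dst, size_t ksize,
			      const void *src, size_t usize)
{
	size_t n = usize < ksize ? usize : ksize;
	const unsigned char *tail = (const unsigned char *)src + n;
	size_t i;

	assert(dst && src);
	for (i = 0; i < usize - n; i++)
		if (tail[i])
			return -E2BIG;
	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}

/* Hypothetical handler that always works on the newest layout. */
static int luo_cmd_sketch(const void *ubuf, struct luo_cmd_v2 *out)
{
	uint32_t usize;
	int rc;

	memcpy(&usize, ubuf, sizeof(usize));
	if (usize < sizeof(struct luo_cmd_v1))
		return -EINVAL;		/* smaller than the first release */
	rc = copy_struct_sketch(out, sizeof(*out), ubuf, usize);
	if (rc)
		return rc;
	if (out->flags)
		return -EOPNOTSUPP;	/* no flags are defined in this sketch */
	return 0;
}
```

With this shape an old binary passing the v1 layout keeps working (the
new field reads as zero), and a newer binary passing a larger struct is
accepted only if the bytes this side does not know about are zero.
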
It is a little confusing how you imagine using UNPRESERVE_XX,
EVENT_CANCEL and close() as various error handling strategies,
especially depending on how we are able to "freeze" a file descriptor.
> An optional sysfs interface will allow userspace applications to
> monitor the LUO's state and react appropriately. e.g. allows SystemD
> to load different services during different live update states.
Makes sense, systemd works a lot better with a sysfs file for knowing
whether the boot is a kexec live update boot or not.
Though I don't know why you'd keep /sys/kernel/liveupdate/prepare and
the others? It seems really weird that something would be able to
safely sequence the update but not have access to the FD.
> 3. Dependency Management: The viability of preserving a specific
> resource (file, device) will be checked when it initially requests
> participation.
> However, the actual dependencies will only be pulled and the final
> ordered list assembled during the prepare phase. This avoids the churn
> of repeatedly adding/removing dependencies as individual components
> register.
Maybe, will have to see how the code works out in practice with real
implementations. I did not imagine having a full "unprepare" idea
since that significantly complicates everything. close() would just
nuke everything.
> struct liveupdate_fs_handle {
> struct list_head liveupdate_entry;
Don't mix data and const function pointers in the same struct.
> int (*prepare)(struct file *filp, void *preserve_page, ...); // Callback during prepare phase
> int (*reboot)(struct file *filp, void *preserve_page,...); // Callback during reboot phase
> void (*finish)(struct file *filp, void *preserve_page,...); // Callback after successful update to do state clean-up
> void (*cancel)(struct file *filp, void *preserve_page,...); // Callback if prepare/reboot is cancelled
> };
But it makes sense overall.
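
One way to read the remark about not mixing data and const function
pointers is the usual kernel split into a const ops table plus a
per-instance handle that points at it. The sketch below is an
assumption about what that could look like, not the proposed API:
struct liveupdate_fs_ops, demo_prepare(), demo_ops and
luo_call_prepare() are hypothetical, the callback arguments elided
with "..." in the quoted struct are simply left out, and list_head is
a minimal userspace stand-in.

```c
#include <assert.h>
#include <stddef.h>

struct file;	/* opaque in this sketch */

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Callbacks only: one shared, read-only table per file type. */
struct liveupdate_fs_ops {
	int (*prepare)(struct file *filp, void *preserve_page);
	int (*reboot)(struct file *filp, void *preserve_page);
	void (*finish)(struct file *filp, void *preserve_page);
	void (*cancel)(struct file *filp, void *preserve_page);
};

/* Mutable per-registration data, pointing at the const table. */
struct liveupdate_fs_handle {
	struct list_head liveupdate_entry;
	const struct liveupdate_fs_ops *ops;
};

/* Hypothetical callback: just marks the preserve page as prepared. */
static int demo_prepare(struct file *filp, void *preserve_page)
{
	(void)filp;
	*(int *)preserve_page = 1;
	return 0;
}

/* Being const, the table can live in read-only memory. */
static const struct liveupdate_fs_ops demo_ops = {
	.prepare = demo_prepare,
};

/* Hypothetical dispatch helper: a missing callback is a no-op. */
static int luo_call_prepare(struct liveupdate_fs_handle *h,
			    struct file *filp, void *preserve_page)
{
	assert(h != NULL);
	if (!h->ops || !h->ops->prepare)
		return 0;
	return h->ops->prepare(filp, preserve_page);
}
```

The list linkage stays writable while the function pointers do not,
which is the property the split is meant to buy.
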
> Preserved File Descriptors (e.g., memfd, kvmfd, iommufd, vfiofd)
> Preserved Devices (ordered appropriately, leaves-to-root)
I think because of the cyclic ordering between kvm/iommu/vfio it may
become a bit complicated. You will want LIVEUPDATE_IOCTL_FD_PRESERVE
to not check dependencies but leave some kind of placeholder so the
cycles can be broken.
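
A toy model of the placeholder idea, purely as an assumption about how
it might be arranged: preserve only records the FD and the FD it refers
to, without checking that the other side is already registered, and the
check is deferred to prepare. None of these names (struct luo_session,
luo_preserve(), luo_prepare(), LUO_NO_DEP) come from the draft header,
and real dependencies would not be plain integers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LUO_MAX		8
#define LUO_NO_DEP	(-1)

enum luo_state { LUO_PLACEHOLDER, LUO_RESOLVED };

struct luo_entry {
	int fd;			/* the preserved FD */
	int dep_fd;		/* FD it depends on, or LUO_NO_DEP */
	enum luo_state state;
};

struct luo_session {
	struct luo_entry e[LUO_MAX];
	size_t n;
};

/* Preserve: record a placeholder, deliberately no dependency check. */
static int luo_preserve(struct luo_session *s, int fd, int dep_fd)
{
	assert(s != NULL);
	if (s->n == LUO_MAX)
		return -ENOSPC;
	s->e[s->n++] = (struct luo_entry){ fd, dep_fd, LUO_PLACEHOLDER };
	return 0;
}

static int luo_has_fd(const struct luo_session *s, int fd)
{
	size_t i;

	for (i = 0; i < s->n; i++)
		if (s->e[i].fd == fd)
			return 1;
	return 0;
}

/* Prepare: every dependency must now exist, in any registration order. */
static int luo_prepare(struct luo_session *s)
{
	size_t i;

	for (i = 0; i < s->n; i++)
		if (s->e[i].dep_fd != LUO_NO_DEP &&
		    !luo_has_fd(s, s->e[i].dep_fd))
			return -ENOENT;
	for (i = 0; i < s->n; i++)
		s->e[i].state = LUO_RESOLVED;
	return 0;
}
```

Because nothing is validated at preserve time, three FDs that refer to
each other in a cycle can be registered in any order, and a dangling
reference is only reported once prepare runs.
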
> Global State Components
You may need a LIVEUPDATE_IOCTL_GLOBAL_PRESERVE as well to select
these?
Jason