Message-ID: <CACePvbWw9G=y_cycWFMXxRbmuAE8yFCM0Z3y=Ojw30ENDkDL-g@mail.gmail.com>
Date: Thu, 2 Oct 2025 15:30:24 -0700
From: Chris Li <chrisl@...nel.org>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, Jason Gunthorpe <jgg@...pe.ca>,
Bjorn Helgaas <bhelgaas@...gle.com>, "Rafael J. Wysocki" <rafael@...nel.org>,
Danilo Krummrich <dakr@...nel.org>, Len Brown <lenb@...nel.org>, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org, linux-acpi@...r.kernel.org,
David Matlack <dmatlack@...gle.com>, Pasha Tatashin <tatashin@...gle.com>,
Jason Miu <jasonmiu@...gle.com>, Vipin Sharma <vipinsh@...gle.com>,
Saeed Mahameed <saeedm@...dia.com>, Adithya Jayachandran <ajayachandra@...dia.com>,
Parav Pandit <parav@...dia.com>, William Tu <witu@...dia.com>, Mike Rapoport <rppt@...nel.org>,
Leon Romanovsky <leon@...nel.org>
Subject: Re: [PATCH v2 06/10] PCI/LUO: Save and restore driver name
On Wed, Oct 1, 2025 at 11:09 PM Greg Kroah-Hartman
<gregkh@...uxfoundation.org> wrote:
> Just keeping a device "alive" while rebooting into the same exact kernel
> image seems odd to me given that this is almost never what people
> actually do. They update their kernel with the weekly stable release to
> get the new bugfixes (remember we fix 13 CVEs a day), and away you go.
> You are saying that this workload would not actually be supported, so
> why do you want live update at all? Who needs this?
I saw Pasha's replies to many of your questions. I can take a stab at
who needs it; others, feel free to add to or correct me. The major
cloud vendors (you know who the usual suspects are) providing GPUs to
VMs will want it. The use case is a VM controlled by the customer. The
cloud provider has a contract limiting maintenance downtime for the
VM, say X seconds per year. When upgrading the host kernel, the VM can
typically be migrated to another host without much interruption, so it
does not take much from the downtime budget. However, when the VM has
a GPU attached and the GPU is running ML jobs, there is no good way to
migrate that GPU context to another machine. Instead, we can do a
liveupdate of the host kernel. During the liveupdate, the old kernel
saves the liveupdate state; the VM is paused to memory while the GPU,
as a PCI device, keeps running, so the ML jobs stay up. The kernel
then kexecs into the new kernel version, restores and reconstructs the
software side of the device state, and the VM is re-attached to the
file descriptor to get the previous context. In the end the VM resumes
running on the new kernel while the GPU keeps running the ML job. From
the VM's point of view, there are Y seconds during the kexec when the
VM does not respond, but the GPU did not lose its context and the VM
did not reboot. The benefit is that Y is much smaller than the time
needed to reboot the VM and restart the GPU ML jobs, so Y can fit into
the X-second maintenance downtime per year in the service contract.
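
To make the budget arithmetic concrete, here is a tiny illustrative
sketch. Every number in it is hypothetical (made up for illustration,
not from any real contract or measurement); the only point is that a
few seconds of kexec blackout per update can fit a yearly budget that
full reboot-plus-job-restart cycles would blow through.

```python
# Back-of-the-envelope check: does the per-update blackout Y, repeated
# over a year of updates, fit the contractual yearly downtime budget X?
# All figures below are assumptions for illustration only.

def fits_budget(blackout_s, updates_per_year, yearly_budget_s):
    """Return True if all planned updates fit the downtime budget."""
    return blackout_s * updates_per_year <= yearly_budget_s

# Hypothetical figures: a full VM reboot plus ML-job restart might take
# minutes, while a liveupdate blackout is a few seconds.
reboot_s = 600     # assumed cost of reboot + job restart, in seconds
liveupdate_s = 5   # assumed kexec blackout Y, in seconds
budget_s = 120     # assumed contractual X: 2 minutes downtime per year

print(fits_budget(reboot_s, 12, budget_s))      # monthly reboots: over budget
print(fits_budget(liveupdate_s, 12, budget_s))  # monthly liveupdates: fit
```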
Hope that explanation makes sense to you.
Chris