Message-ID: <CA+CK2bA0acjg-CEKufERu_ov4up3E4XTkJ6kbEDCny0iASrFVQ@mail.gmail.com>
Date: Wed, 1 Oct 2025 17:03:19 -0400
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Chris Li <chrisl@...nel.org>, Jason Gunthorpe <jgg@...pe.ca>, Bjorn Helgaas <bhelgaas@...gle.com>, 
	"Rafael J. Wysocki" <rafael@...nel.org>, Danilo Krummrich <dakr@...nel.org>, Len Brown <lenb@...nel.org>, 
	linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org, 
	linux-acpi@...r.kernel.org, David Matlack <dmatlack@...gle.com>, 
	Pasha Tatashin <tatashin@...gle.com>, Jason Miu <jasonmiu@...gle.com>, 
	Vipin Sharma <vipinsh@...gle.com>, Saeed Mahameed <saeedm@...dia.com>, 
	Adithya Jayachandran <ajayachandra@...dia.com>, Parav Pandit <parav@...dia.com>, William Tu <witu@...dia.com>, 
	Mike Rapoport <rppt@...nel.org>, Leon Romanovsky <leon@...nel.org>
Subject: Re: [PATCH v2 06/10] PCI/LUO: Save and restore driver name

Hi Greg,

On Wed, Oct 1, 2025 at 1:06 AM Greg Kroah-Hartman
<gregkh@...uxfoundation.org> wrote:
>
> On Tue, Sep 30, 2025 at 11:56:58AM -0400, Pasha Tatashin wrote:
> > > > A driver that preserves state across a reboot already has an implicit
> > > > contract with its future self about that data's format. The GUID
> > > > simply makes that contract explicit and machine-checkable. It does not
> > > > have to be GUID, but nevertheless there has to be a specific contract.
> > >
> > > So how are you going to "version" these GUID?  I see you use "schema Vx"
> >
> > Driver developer who changes a driver to support live-update.
>
> I do not understand this response, sorry.

Sorry for the confusion, I misunderstood your question. I thought you
were asking who would add a new field to a driver. My answer was that
it would be the developer who is adding support for the Live Update
feature to that specific driver.

I now realize you were asking about how the GUID would be versioned.
Using a GUID was just one of several ideas. My main point is that we
need some form of versioned compatibility identifier, whether it's a
string or a number. This would allow the system to verify that the new
driver can understand the preserved data for this device from the
previous kernel before it binds to the device.
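
To make the idea concrete, here is a minimal userspace sketch of such a
check. It is only an illustration under stated assumptions: the names
(struct lu_compat, struct lu_driver, lu_driver_can_adopt) are
hypothetical and are not part of this series or of any existing kernel
API, and it assumes the identifier is a (name, schema version) pair
saved by the previous kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compatibility identifier preserved by the old kernel. */
struct lu_compat {
	const char *name;	/* name of the preserved data format */
	unsigned int version;	/* schema version of that format */
};

/* Hypothetical driver description: the formats the new driver can parse. */
struct lu_driver {
	const char *name;
	const struct lu_compat *supported;
	size_t nr_supported;
};

/*
 * Return true only if the new driver explicitly lists the preserved
 * format, so a mismatch is caught before the driver binds to the device.
 */
bool lu_driver_can_adopt(const struct lu_driver *drv,
			 const struct lu_compat *preserved)
{
	size_t i;

	for (i = 0; i < drv->nr_supported; i++) {
		const struct lu_compat *c = &drv->supported[i];

		if (c->version == preserved->version &&
		    strcmp(c->name, preserved->name) == 0)
			return true;
	}
	return false;
}
```

A string plus a number is used here instead of a GUID only to keep the
sketch readable; the same lookup would work with any identifier that
can be compared for equality.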

> > > above, but how is that really going to work in the end?  Lots of data
> > > structures change underneath the base driver that it knows nothing
> > > about, not to mention basic things like compiler flags and the like
> > > (think about how we have changed things for spectre issues over the
> > > years...)
> >
> > We are working on versioning protocol, the GUID I am suggesting is not
> > to protect "struct" coherency, but just to identify which driver to
> > bind to which device compatibility.
>
> So you have a new way of matching drivers to devices?  That's odd.

Correct. For a device that persists across a live update, the driver
matching logic in the new kernel would need to be altered.

Unless the device can stay unbound until initramfs, as Jason suggested
earlier in the thread. But even then, probing would need to be altered
to keep the device unbound.

> > > And when can you delete an old "schema"?  This feels like you are
> > > forcing future developers to maintain things "for forever"...
> >
> > This won't be an issue because of how live update support is planned.
> > The support model will be phased and limited:
> >
> > Initially, and for a while there will be no stability guarantees
> > between different kernel versions.
> > Eventually, we will support specific, narrow upgrade paths (e.g.,
> > minor-to-minor, or stable-A to stable-A+1).
> > Downgrades and arbitrary version jumps ("any-to-any") will not be
> > supported upstream. Since we only ever need to handle a well-defined
> > forward path, the code for old, irrelevant schemas can always be
> > removed. There is no "forever".
>
> This is kernel code, it is always "forever", sorry.

I'm sorry, but I don't quite understand what you mean. There is no
stable internal kernel API; the upstream tree is constantly evolving
with features being added, improved, and removed.

> If you want "minor to minor" update, how is that going to work given
> that you do not add changes only to "minor" releases (that being the
> 6.12.y the "y" number).

You are correct. Initially, our plan is to allow live updates to break
between any two kernel versions. However, it is my hope that we will
eventually stabilize this process and only allow breakages between,
for example, versions 6.n and 6.n+2, and eventually from one stable
release to stable+2. This would create a well-defined window for
safely removing deprecated data formats and the code that handles them
from the kernel.
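
As a rough sketch of what such a window could mean in practice, assume
releases are modelled as a plain increasing counter (not a semantic
version) and the window is two releases. Both the constant and the
function names below are hypothetical; nothing like this has been
agreed on or posted.

```c
#include <stdbool.h>

/* Hypothetical policy: breakage is allowed once releases are this far apart. */
#define LU_COMPAT_WINDOW 2

/*
 * Return true if a live update from release 'from' to release 'to' is
 * inside the supported window. Downgrades are never supported.
 */
bool lu_update_in_window(unsigned int from, unsigned int to)
{
	if (to < from)
		return false;
	return to - from < LU_COMPAT_WINDOW;
}

/*
 * Return true if release 'rel' may drop the code that reads a data
 * format last written by release 'last_writer', i.e. no supported
 * source kernel can still hand that format to 'rel'.
 */
bool lu_format_removable(unsigned int last_writer, unsigned int rel)
{
	return rel >= last_writer + LU_COMPAT_WINDOW;
}
```

With a window of 2, an update from release n to n+1 stays supported,
n to n+2 is allowed to break, and a format last written by release n
can be removed starting with release n+2.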

> Remember, Linux does not use "semantic versioning" as its release
> numbering is older than that scheme.  It just does "this version is
> newer than that version" and that's it.  You can't really take anything
> else from the number.

Understood. If that's the case, we could use stable releases as the
basis for defining when a live update can break. It would take longer
to achieve, but it is a possibility. These are the kinds of questions
that will be discussed at the LPC Liveupdate MC. If you are attending
LPC, I encourage you to join the discussion, as your thoughts on how
we can frame long-term live update support would be very valuable.

> And if this isn't for "upstream" at all, then why have it?  We can't add
> new features and support it if we can't actually use it and it's only
> for out-of-tree vendor kernels.

Our goal is to have full support in the upstream kernel. Downstream
users will then need to adapt live updates to their specific needs.
For example, if a live update from version A to version C is broken, a
downstream user would either have to update incrementally from A to B
and then to C, or they would have to internally fix whatever is
causing the breakage before performing the live update.

> And how will you document properly a "well defined forward path"?  That
> should be done first, before you have any code here that we are
> reviewing.

Currently, and for the near future, live updates will only be
supported within the same kernel version.

> Please do that, get people to agree on the idea and how it will work
> before asking us to review code.

This is an industry-wide effort. We have engineers from Amazon,
Google, Microsoft, Nvidia, and other companies meeting bi-weekly to
discuss Live Update support, and sending and landing patches upstream.
We are also organizing an LPC Live Update Micro Conference where the
versioning strategy will be a topic.

For now, we have agreed that the live update can break between any
kernel versions or with any commit while the feature is under active
development. This approach allows us the flexibility to build the core
functionality while we collaboratively define the long-term versioning
and stability model.

Thank you,
Pasha
