Message-ID: <aRyLbB8yoQwUJ3dh@kernel.org>
Date: Tue, 18 Nov 2025 17:06:20 +0200
From: Mike Rapoport <rppt@...nel.org>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, pratyush@...nel.org,
jasonmiu@...gle.com, graf@...zon.com, dmatlack@...gle.com,
rientjes@...gle.com, corbet@....net, rdunlap@...radead.org,
ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com,
ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org,
akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr,
mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com,
axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com,
vincent.guittot@...aro.org, hannes@...xchg.org,
dan.j.williams@...el.com, david@...hat.com,
joel.granados@...nel.org, rostedt@...dmis.org,
anna.schumaker@...cle.com, song@...nel.org, linux@...ssschuh.net,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, gregkh@...uxfoundation.org, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
myungjoo.ham@...sung.com, yesanishhere@...il.com,
Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
aleksander.lobakin@...el.com, ira.weiny@...el.com,
andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
stuart.w.hayes@...il.com, ptyadav@...zon.de, lennart@...ttering.net,
brauner@...nel.org, linux-api@...r.kernel.org,
linux-fsdevel@...r.kernel.org, saeedm@...dia.com,
ajayachandra@...dia.com, parav@...dia.com, leonro@...dia.com,
witu@...dia.com, hughd@...gle.com, skhawaja@...gle.com,
chrisl@...nel.org
Subject: Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO
On Tue, Nov 18, 2025 at 10:03:00AM -0400, Jason Gunthorpe wrote:
> On Tue, Nov 18, 2025 at 01:21:34PM +0200, Mike Rapoport wrote:
> > On Mon, Nov 17, 2025 at 11:22:54PM -0500, Pasha Tatashin wrote:
> > > > You can avoid that complexity if you register the device with a different
> > > > fops, but that's technicality.
> > > >
> > > > Your point about treating the incoming FDT as an underlying resource that
> > > > failed to initialize makes sense, but nevertheless userspace needs a
> > > > reliable way to detect it and parsing dmesg is not something we should rely
> > > > on.
> > >
> > > I see two solutions:
> > >
> > > 1. LUO fails to retrieve the preserved data, the user gets informed by
> > > not finding /dev/liveupdate, and studying the dmesg for what has
> > > happened (in reality in fleets version mismatches should not be
> > > happening, those should be detected in quals).
> > > 2. Create a zombie device to return some errno on open, and still
> > > study dmesg to understand what really happened.
> >
> > User should not study dmesg. We need another solution.
> > What's wrong with e.g. ioctl()?
>
> It seems very dangerous to even boot at all if the next kernel doesn't
> understand the serialization information..
>
> IMHO I think we should not even be thinking about this, it is up to
> the predecessor environment to prevent it from happening. The ideas to
> use ELF metadata/etc to allow a pre-flight validation are the right
> solution.
>
> If we get into the next kernel and it receives information it cannot
> process it should just BUG_ON and die, or some broad equivalent.
> It is a catastrophic orchestration error, and we don't need some fine
> grain recovery or userspace visibility. Crash dump the system and
> reboot it.
I was under the impression that Pasha wanted to get to userspace no matter
what.

panic() in liveupdate_early_init() makes perfect sense to me. Parsing dmesg
does not.
> IOW, I would not invest time in this.
>
> Jason
--
Sincerely yours,
Mike.