Message-ID: <20251124190818.GI153257@nvidia.com>
Date: Mon, 24 Nov 2025 15:08:18 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: Mike Rapoport <rppt@...nel.org>, pratyush@...nel.org,
jasonmiu@...gle.com, graf@...zon.com, dmatlack@...gle.com,
rientjes@...gle.com, corbet@....net, rdunlap@...radead.org,
ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com,
ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org,
akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr,
mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com,
axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com,
vincent.guittot@...aro.org, hannes@...xchg.org,
dan.j.williams@...el.com, david@...hat.com,
joel.granados@...nel.org, rostedt@...dmis.org,
anna.schumaker@...cle.com, song@...nel.org, linux@...ssschuh.net,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, gregkh@...uxfoundation.org, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
myungjoo.ham@...sung.com, yesanishhere@...il.com,
Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
aleksander.lobakin@...el.com, ira.weiny@...el.com,
andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
stuart.w.hayes@...il.com, ptyadav@...zon.de, lennart@...ttering.net,
brauner@...nel.org, linux-api@...r.kernel.org,
linux-fsdevel@...r.kernel.org, saeedm@...dia.com,
ajayachandra@...dia.com, parav@...dia.com, leonro@...dia.com,
witu@...dia.com, hughd@...gle.com, skhawaja@...gle.com,
chrisl@...nel.org
Subject: Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO

On Tue, Nov 18, 2025 at 10:03:03PM -0500, Pasha Tatashin wrote:
> On Tue, Nov 18, 2025 at 6:25 PM Jason Gunthorpe <jgg@...dia.com> wrote:
> >
> > On Tue, Nov 18, 2025 at 05:07:15PM -0500, Pasha Tatashin wrote:
> >
> > > In this case, we cannot even rely on having "safe" memory, i.e. this
> > > scratch only boot to preserve dmesg/core etc, this is unfortunate. Is
> > > there a way to avoid defaulting to identity mode when we are booting
> > > into the "maintenance" mode?
> >
> > Maybe one could be created?
> >
> > It's tricky though because you also really want to block drivers from
> > using the iommu if you don't know they are quieted and you can't do
> > that without parsing the KHO data, which you can't do because it
> > doesn't understand it..
> >
> > IDK, I think the "maintenance" mode is something that is probably best
> > effort and shouldn't be relied on. It will work if the iommu data is
> > restored or other lucky conditions hit, so it is not useless, but it
> > is certainly not robust or guaranteed.
>
> Right, even kdump has always been best-effort; many types of crashes
> do not make it to the crash kernel.
>
> > You are better to squirt a panic message out of the serial port and
>
> For early boot LUO mismatches, or if FLB data is inaccessible for any
> reason, devices might go rogue, so triggering a panic during boot is
> appropriate.
>
> However, session and file data structures are deserialized later, when
> /dev/liveupdate is first opened by userspace. If deserialization fails
> at that stage, I think we should simply fail the open(/dev/liveupdate)
> call with an error such as -EIO.

That seems reasonable; if you reached this point then it is probably
OK.
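
As a rough illustration of the "fail the open" behaviour discussed above,
here is a minimal userspace sketch, not kernel code. The struct and both
function names are hypothetical and exist only for this example; the
data_valid flag stands in for "the preserved blob parses cleanly".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state carried across the live update. */
struct luo_model {
	bool deserialized;	/* set once session/file data has been parsed */
	bool data_valid;	/* stand-in for "preserved data parses cleanly" */
};

/* Hypothetical stand-in for parsing preserved session and file data. */
static int model_deserialize(struct luo_model *luo)
{
	if (!luo->data_valid)
		return -EIO;
	luo->deserialized = true;
	return 0;
}

/* Models open(/dev/liveupdate): deserialize lazily, on first open only. */
static int model_open(struct luo_model *luo)
{
	if (luo->deserialized)
		return 0;
	return model_deserialize(luo);
}
```

In this model a bad blob makes every open attempt fail with -EIO, while the
rest of the system keeps running, which is the contrast with the early-boot
panic case.
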

Most likely the prior kernel should mark some critical things like kho,
iommu and pci data as 'mandatory early boot' and if the new kernel
doesn't use them then blow up right away.

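
One way to picture the 'mandatory early boot' marking is the userspace
sketch below. It is only an assumption about how such a check could look:
the entry layout, the flag names and both helpers are hypothetical and are
not part of KHO or LUO.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record written by the prior kernel for each preserved blob. */
struct preserved_entry {
	const char *name;	/* e.g. "kho", "iommu", "pci" */
	bool mandatory_early;	/* prior kernel requires early-boot consumption */
	bool consumed;		/* set once a handler in the new kernel claims it */
};

/* Mark the entry called @name as consumed; returns false if it is unknown. */
static bool claim_entry(struct preserved_entry *e, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(e[i].name, name) == 0) {
			e[i].consumed = true;
			return true;
		}
	}
	return false;
}

/*
 * Return the first mandatory entry that nobody consumed, or NULL if all of
 * them are accounted for. A kernel would panic at this point rather than
 * return a pointer.
 */
static const struct preserved_entry *
first_unclaimed_mandatory(const struct preserved_entry *e, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (e[i].mandatory_early && !e[i].consumed)
			return &e[i];
	}
	return NULL;
}
```

The check would run once early-boot handlers have had their chance to claim
entries; any non-NULL result means the new kernel does not understand
something the prior kernel considered critical.
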
Jason