lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251020142924.GS316284@nvidia.com>
Date: Mon, 20 Oct 2025 11:29:24 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Pratyush Yadav <pratyush@...nel.org>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, jasonmiu@...gle.com,
	graf@...zon.com, changyuanl@...gle.com, rppt@...nel.org,
	dmatlack@...gle.com, rientjes@...gle.com, corbet@....net,
	rdunlap@...radead.org, ilpo.jarvinen@...ux.intel.com,
	kanie@...ux.alibaba.com, ojeda@...nel.org, aliceryhl@...gle.com,
	masahiroy@...nel.org, akpm@...ux-foundation.org, tj@...nel.org,
	yoann.congal@...le.fr, mmaurer@...gle.com, roman.gushchin@...ux.dev,
	chenridong@...wei.com, axboe@...nel.dk, mark.rutland@....com,
	jannh@...gle.com, vincent.guittot@...aro.org, hannes@...xchg.org,
	dan.j.williams@...el.com, david@...hat.com,
	joel.granados@...nel.org, rostedt@...dmis.org,
	anna.schumaker@...cle.com, song@...nel.org, zhangguopeng@...inos.cn,
	linux@...ssschuh.net, linux-kernel@...r.kernel.org,
	linux-doc@...r.kernel.org, linux-mm@...ck.org,
	gregkh@...uxfoundation.org, tglx@...utronix.de, mingo@...hat.com,
	bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org,
	hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
	bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
	myungjoo.ham@...sung.com, yesanishhere@...il.com,
	Jonathan.Cameron@...wei.com, quic_zijuhu@...cinc.com,
	aleksander.lobakin@...el.com, ira.weiny@...el.com,
	andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
	bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
	stuart.w.hayes@...il.com, lennart@...ttering.net,
	brauner@...nel.org, linux-api@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, saeedm@...dia.com,
	ajayachandra@...dia.com, parav@...dia.com, leonro@...dia.com,
	witu@...dia.com, hughd@...gle.com, skhawaja@...gle.com,
	chrisl@...nel.org, steven.sistare@...cle.com
Subject: Re: [PATCH v4 00/30] Live Update Orchestrator

On Tue, Oct 14, 2025 at 03:29:59PM +0200, Pratyush Yadav wrote:
> > 1) Use a vmalloc and store a list of the PFNs in the pool. Pool becomes
> >    frozen, can't add/remove PFNs.
> 
> Doesn't that circumvent LUO's state machine? The idea with the state
> machine was to have clear points in time when the system goes into the
> "limited capacity"/"frozen" state, which is the LIVEUPDATE_PREPARE
> event. 

I wouldn't get too invested in the FSM, it is there but it doesn't
mean every luo client has to be focused on it.

> With what you propose, the first FD being preserved implicitly
> triggers the prepare event. Same thing for unprepare/cancel operations.

Yes, this is easy to write and simple to manage.

> I am wondering if it is better to do it the other way round: prepare all
> files first, and then prepare the hugetlb subsystem at
> LIVEUPDATE_PREPARE event. At that point it already knows which pages to
> mark preserved so the serialization can be done in one go.

I think this would be slower and more complex?

> > 2) Require the users of hugetlb memory, like memfd, to
> >    preserve/restore the folios they are using (using their hugetlb order)
> > 3) Just before kexec run over the PFN list and mark a bit if the folio
> >    was preserved by KHO or not. Make sure everything gets KHO
> >    preserved.
> 
> "just before kexec" would need a callback from LUO. I suppose a
> subsystem is the place for that callback. I wrote my email under the
> (wrong) impression that we were replacing subsystems.

The file descriptors path should have luo client ops that have all
the required callbacks. This is probably an existing op.

> That makes me wonder: how is the subsystem-level callback supposed to
> access the global data? I suppose it can use the liveupdate_file_handler
> directly, but it is kind of strange since technically the subsystem and
> file handler are two different entities.

If we need such things we would need a way to link these together, but
I'm wonder if we really don't..

> Also as Pasha mentioned, 1G pages for guest_memfd will use hugetlb, and
> I'm not sure how that would map with this shared global data. memfd and
> guest_memfd will likely have different liveupdate_file_handler but would
> share data from the same subsystem. Maybe that's a problem to solve for
> later...

On preserve memfd should call into hugetlb to activate it as a hugetlb
page provider and preserve it too.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ