Message-ID: <CAAywjhQGQx2_2X8r0rf3AgMDbJj-9C=9_1a3xgiLwuzKLAvXCQ@mail.gmail.com>
Date: Thu, 2 Oct 2025 12:29:25 -0700
From: Samiullah Khawaja <skhawaja@...gle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, David Woodhouse <dwmw2@...radead.org>, 
	Lu Baolu <baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>, 
	Will Deacon <will@...nel.org>, iommu@...ts.linux.dev, YiFei Zhu <zhuyifei@...gle.com>, 
	Robin Murphy <robin.murphy@....com>, Pratyush Yadav <pratyush@...nel.org>, 
	Kevin Tian <kevin.tian@...el.com>, linux-kernel@...r.kernel.org, 
	Saeed Mahameed <saeedm@...dia.com>, Adithya Jayachandran <ajayachandra@...dia.com>, 
	Parav Pandit <parav@...dia.com>, Leon Romanovsky <leonro@...dia.com>, William Tu <witu@...dia.com>, 
	Vipin Sharma <vipinsh@...gle.com>, dmatlack@...gle.com, Chris Li <chrisl@...nel.org>, 
	praan@...gle.com
Subject: Re: [RFC PATCH 13/15] iommufd: Persist iommu domains for live update

On Thu, Oct 2, 2025 at 8:10 AM Jason Gunthorpe <jgg@...pe.ca> wrote:
>
> On Thu, Oct 02, 2025 at 10:43:45AM -0400, Pasha Tatashin wrote:
> > > Finish on resume shouldn't indicate anything specific beyond the luo
> > > should be empty and everything should have been restored. It isn't
> > > like finish on pre-kexec.
> > >
> > > Userspace decides how it sequences things and what steps it takes
> > > before ending blackout and resuming the VM.
> >
> > This is a fair statement: userspace knows when vCPUs are resumed and
> > can decide when to do the HWPT swap. Following that logic, what if we
> > provide a specific ioctl() to perform the swap?
>
> Yeah, that is what I've been talking about. The ioctl already exists
> in iommufd..

Yes, I agree. We can use the existing ioctl and the hotswap happens
when userspace attaches the new HWPT to the device. That has been my
understanding as well.

Userspace should indeed have full autonomy to perform the hotswap
whenever the VMM (and HWPT) is ready.
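A minimal user-space model of the hotswap semantics described above, not kernel code. The thread does not name the ioctl; for VFIO cdev devices the likely candidate is VFIO_DEVICE_ATTACH_IOMMUFD_PT, which replaces the current page table when the device is already attached, but treat that as an assumption. `struct model_device` and `model_attach_hwpt()` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical model: a device remembers which HWPT id translates its DMA. */
struct model_device {
    uint32_t attached_hwpt; /* 0 means nothing attached */
};

/*
 * Attach new_hwpt to the device. If a HWPT was already attached (for
 * example the one restored after live update), it is replaced in one step.
 * Returns the previous id so the caller can destroy the old HWPT.
 */
static uint32_t model_attach_hwpt(struct model_device *dev, uint32_t new_hwpt)
{
    uint32_t old = dev->attached_hwpt;

    dev->attached_hwpt = new_hwpt;
    return old;
}
```

In this model the VMM decides when to call `model_attach_hwpt()`, which matches the point that userspace, not LUO, chooses the moment of the swap.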
>
> > What do we do if the user reclaimed iommufd but did not swap HWPT or
> > did not perform some other ioctl() before finish(), simply print a
> > kernel warnings and let it be, or force swapping during finish before
> > going into normal mode?
>
> The problem we haven't discussed how to solve is the linkage between
> the iommu_domain and the memfd.
>
> Since the preserved iommu_domain is referring to memory owned by the
> memfd and the pins don't get restored until the iommufd starts and
> generates new pins. Thus we need to keep the memfd in a frozen state.

Yes, there are dependencies between preserved FDs, and we need to
consider them during LUO PREPARE (preservation). We can use an LUO
helper in the can_preserve callback to check if a dependency is also
going to be preserved. I discuss the restore part below.
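A rough sketch of the dependency check at preservation time, assuming LUO can tell a handler which FDs are marked for preservation. All names here (`struct luo_file`, `luo_fd_is_preserved()`, `iommufd_can_preserve()`) are hypothetical and do not come from the patch series; this is a user-space model, not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a file that userspace registered with LUO. */
struct luo_file {
    int fd;                   /* file descriptor number */
    bool marked_for_preserve; /* userspace asked LUO to preserve it */
};

/* Hypothetical LUO helper: is fd among the files marked for preservation? */
static bool luo_fd_is_preserved(const struct luo_file *files, size_t n, int fd)
{
    for (size_t i = 0; i < n; i++)
        if (files[i].fd == fd && files[i].marked_for_preserve)
            return true;
    return false;
}

/*
 * Hypothetical can_preserve check for an iommufd whose domain maps memory
 * owned by the memfds in dep_fds: refuse unless every dependency is also
 * going to be preserved.
 */
static bool iommufd_can_preserve(const struct luo_file *files, size_t n,
                                 const int *dep_fds, size_t ndeps)
{
    for (size_t i = 0; i < ndeps; i++)
        if (!luo_fd_is_preserved(files, n, dep_fds[i]))
            return false;
    return true;
}
```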
>
> Maybe that is the real use case for finish - things like memfd remain
> frozen until finish concludes.

Yes, for the memfd LUO file_handler, maybe that is the purpose of FINISH.

But that raises the question of when a dependency is ready, or allowed,
to mutate and FINISH. How would the LUO file_handler of a dependency
know that it is safe to mutate or finish? Maybe LUO calls the iommufd
FINISH first, and if that fails, the dependencies don't get a FINISH
call.
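The ordering idea can be modeled roughly as follows, assuming each handler exposes a finish callback that returns 0 on success. `struct luo_handler`, `luo_finish_ordered()`, and the two example callbacks are hypothetical names for illustration, not an existing LUO interface.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a LUO file handler and the handlers it depends on. */
struct luo_handler {
    const char *name;
    int (*finish)(struct luo_handler *h); /* 0 on success, negative errno */
    int finished;                         /* set once FINISH has succeeded */
    struct luo_handler **deps;            /* e.g. memfds backing the domain */
    size_t ndeps;
};

/* Example callbacks: one that succeeds, one that is not ready yet. */
static int finish_ok(struct luo_handler *h)   { (void)h; return 0; }
static int finish_busy(struct luo_handler *h) { (void)h; return -EBUSY; }

/*
 * Call FINISH on the dependent (iommufd) first. Only if that succeeds do
 * the dependencies get their FINISH call; otherwise they stay frozen.
 */
static int luo_finish_ordered(struct luo_handler *dependent)
{
    int ret = dependent->finish(dependent);

    if (ret)
        return ret; /* dependencies are not touched */
    dependent->finished = 1;

    for (size_t i = 0; i < dependent->ndeps; i++) {
        struct luo_handler *dep = dependent->deps[i];

        ret = dep->finish(dep);
        if (ret)
            return ret;
        dep->finished = 1;
    }
    return 0;
}
```

With this shape, a memfd handler never needs to ask whether it is safe to unfreeze: it is only called after the iommufd that depends on it has finished.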

I had a quick discussion with Pasha to see how LUO can help with FD
dependencies and FINISH order. Perhaps we need a new LUO API that
iommufd can call before live update, explicitly telling LUO that it
depends on an FD that is going to be preserved.

>
> However, keeping with the keep it simple theme, finish can just not
> succeed if there are stray objects that userspace has not cleaned up
> floating around. Eg a simple refcount and iommu_domain decrs it when
> it is destroyed.
>
> Jason
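The refcount suggestion quoted above could look roughly like this in a user-space model. `struct luo_session` and the three functions are hypothetical names; the real kernel code would likely use a proper refcount type and locking, which this sketch leaves out.

```c
#include <errno.h>

/* Hypothetical model: counts restored iommu_domains that are still alive. */
struct luo_session {
    int restored_domains;
};

/* Called when a preserved iommu_domain is restored after kexec. */
static void luo_domain_restored(struct luo_session *s)
{
    s->restored_domains++;
}

/* Called when userspace has swapped it out and the domain is destroyed. */
static void luo_domain_destroyed(struct luo_session *s)
{
    if (s->restored_domains > 0)
        s->restored_domains--;
}

/* FINISH refuses to succeed while stray restored objects remain. */
static int luo_session_finish(const struct luo_session *s)
{
    return s->restored_domains ? -EBUSY : 0;
}
```

Here a failed finish simply tells userspace it still has cleanup to do; nothing is force-swapped on its behalf.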
