Message-ID: <20251002151012.GF3195829@ziepe.ca>
Date: Thu, 2 Oct 2025 12:10:12 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Pasha Tatashin <pasha.tatashin@...een.com>
Cc: Samiullah Khawaja <skhawaja@...gle.com>,
David Woodhouse <dwmw2@...radead.org>,
Lu Baolu <baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, iommu@...ts.linux.dev,
YiFei Zhu <zhuyifei@...gle.com>,
Robin Murphy <robin.murphy@....com>,
Pratyush Yadav <pratyush@...nel.org>,
Kevin Tian <kevin.tian@...el.com>, linux-kernel@...r.kernel.org,
Saeed Mahameed <saeedm@...dia.com>,
Adithya Jayachandran <ajayachandra@...dia.com>,
Parav Pandit <parav@...dia.com>,
Leon Romanovsky <leonro@...dia.com>, William Tu <witu@...dia.com>,
Vipin Sharma <vipinsh@...gle.com>, dmatlack@...gle.com,
Chris Li <chrisl@...nel.org>, praan@...gle.com
Subject: Re: [RFC PATCH 13/15] iommufd: Persist iommu domains for live update
On Thu, Oct 02, 2025 at 10:43:45AM -0400, Pasha Tatashin wrote:
> > Finish on resume shouldn't indicate anything specific beyond the luo
> > should be empty and everything should have been restored. It isn't
> > like finish on pre-kexec.
> >
> > Userspace decides how it sequences things and what steps it takes
> > before ending blackout and resuming the VM.
>
> This is a fair statement: userspace knows when vCPUs are resumed and
> can decide when to do the HWPT swap. Following that logic, what if we
> provide a specific ioctl() to perform the swap?

Yeah, that is what I've been talking about. The ioctl already exists
in iommufd.

> What do we do if the user reclaimed iommufd but did not swap HWPT or
> did not perform some other ioctl() before finish(), simply print a
> kernel warnings and let it be, or force swapping during finish before
> going into normal mode?

The problem we haven't discussed how to solve is the linkage between
the iommu_domain and the memfd.

The preserved iommu_domain refers to memory owned by the memfd, and
the pins don't get restored until the iommufd starts and generates
new pins. Thus we need to keep the memfd in a frozen state.

Maybe that is the real use case for finish - things like memfd remain
frozen until finish concludes.

However, in keeping with the keep-it-simple theme, finish can just not
succeed if there are stray objects that userspace has not cleaned up
floating around. E.g. a simple refcount that the iommu_domain
decrements when it is destroyed.
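
To make the refcount idea concrete, something like the following
user-space sketch is what I have in mind. Every name here
(luo_session, luo_obj_restored, luo_obj_destroyed, luo_finish) is
hypothetical and not an existing LUO or iommufd interface, and a real
kernel version would presumably use refcount_t or atomic_t rather than
C11 atomics. It only illustrates the rule that finish fails while
preserved objects are still alive.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical names throughout; none of this is existing kernel API. */
struct luo_session {
	/* Number of restored objects that have not been destroyed yet. */
	atomic_int preserved_objs;
};

/* Assumed hook: a preserved iommu_domain was restored after kexec. */
void luo_obj_restored(struct luo_session *s)
{
	atomic_fetch_add(&s->preserved_objs, 1);
}

/* Assumed hook: called from the preserved iommu_domain's destroy path. */
void luo_obj_destroyed(struct luo_session *s)
{
	atomic_fetch_sub(&s->preserved_objs, 1);
}

/* finish refuses to complete while stray preserved objects remain. */
int luo_finish(struct luo_session *s)
{
	if (atomic_load(&s->preserved_objs) != 0)
		return -EBUSY;
	return 0;
}
```

With this shape userspace gets -EBUSY from finish until it has swapped
the HWPT and destroyed the preserved domain, and can simply retry once
it has cleaned up.
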
Jason