Message-ID: <20260113092940.000050b8@linux.microsoft.com>
Date: Tue, 13 Jan 2026 09:29:40 -0800
From: Jacob Pan <jacob.pan@...ux.microsoft.com>
To: Michael Kelley <mhklinux@...look.com>
Cc: Yu Zhang <zhangyu1@...ux.microsoft.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "iommu@...ts.linux.dev"
<iommu@...ts.linux.dev>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "kys@...rosoft.com" <kys@...rosoft.com>,
"haiyangz@...rosoft.com" <haiyangz@...rosoft.com>, "wei.liu@...nel.org"
<wei.liu@...nel.org>, "decui@...rosoft.com" <decui@...rosoft.com>,
"lpieralisi@...nel.org" <lpieralisi@...nel.org>, "kwilczynski@...nel.org"
<kwilczynski@...nel.org>, "mani@...nel.org" <mani@...nel.org>,
"robh@...nel.org" <robh@...nel.org>, "bhelgaas@...gle.com"
<bhelgaas@...gle.com>, "arnd@...db.de" <arnd@...db.de>, "joro@...tes.org"
<joro@...tes.org>, "will@...nel.org" <will@...nel.org>,
"robin.murphy@....com" <robin.murphy@....com>,
"easwar.hariharan@...ux.microsoft.com"
<easwar.hariharan@...ux.microsoft.com>, "nunodasneves@...ux.microsoft.com"
<nunodasneves@...ux.microsoft.com>, "mrathor@...ux.microsoft.com"
<mrathor@...ux.microsoft.com>, "peterz@...radead.org"
<peterz@...radead.org>, "linux-arch@...r.kernel.org"
<linux-arch@...r.kernel.org>
Subject: Re: [RFC v1 5/5] iommu/hyperv: Add para-virtualized IOMMU support
for Hyper-V guest
Hi Michael,
On Mon, 12 Jan 2026 17:48:30 +0000
Michael Kelley <mhklinux@...look.com> wrote:
>
> From: Yu Zhang <zhangyu1@...ux.microsoft.com> Sent: Monday, January
> 12, 2026 8:56 AM
> >
> > On Thu, Jan 08, 2026 at 06:48:59PM +0000, Michael Kelley wrote:
> > > From: Yu Zhang <zhangyu1@...ux.microsoft.com> Sent: Monday,
> > > December 8, 2025 9:11 PM
> >
> > <snip>
> > Thank you so much, Michael, for the thorough review!
> >
> > I've snipped some comments I fully agree with and will address in
> > next version. Actually, I have to admit I agree with your remaining
> > comments below as well. :)
> >
> > > > +struct hv_iommu_dev *hv_iommu_device;
> > > > +static struct hv_iommu_domain hv_identity_domain;
> > > > +static struct hv_iommu_domain hv_blocking_domain;
> > >
> > > Why is hv_iommu_device allocated dynamically while the two
> > > domains are allocated statically? Seems like the approach could
> > > be consistent, though maybe there's some reason I'm missing.
> > >
> >
> > On second thought, `hv_identity_domain` and `hv_blocking_domain`
> > should likely be allocated dynamically as well, consistent with
> > `hv_iommu_device`.
>
> I don't know if there's a strong rationale either way (static
> allocation vs. dynamic). If the long-term expectation is that there
> is never more than one PV IOMMU in a guest, then static would be OK.
> If future direction allows that there could be multiple PV IOMMUs in
> a guest, then doing dynamic from the start is justifiable (though the
> current PV IOMMU hypercalls seem to assume only one PV IOMMU). But
> either way, being consistent is desirable.
>
I believe we only need a single global static identity domain here,
regardless of how many vIOMMUs there may be. From the guest's perspective,
the hvIOMMU only supports hardware-passthrough identity domains, which
do not maintain any per-IOMMU state, i.e., there is no identity domain
based on an S1 IO page table.

The expected physical IOMMU settings for a guest identity domain are
as follows:
- Intel VT-d: PASID entry PGTT = 010b (second-stage translation only)
- AMD: DTE TV=1, GV=0
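
To illustrate the point about per-IOMMU state, here is a minimal userspace
sketch, not the driver code. All names (model_domain, model_iommu,
model_device, model_attach_identity) are hypothetical stand-ins and do not
correspond to the kernel's struct iommu_domain or to anything in this
series. It only shows that a domain carrying no page table and no owner
pointer can be one read-only global object shared by devices behind
different vIOMMUs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the kernel IOMMU structures. */
enum model_domain_type { MODEL_DOMAIN_IDENTITY, MODEL_DOMAIN_BLOCKING };

struct model_domain {
	enum model_domain_type type;	/* no page table, no owning IOMMU */
};

struct model_iommu {
	int id;
};

struct model_device {
	struct model_iommu *iommu;		/* which vIOMMU the device sits behind */
	const struct model_domain *domain;	/* currently attached domain */
};

/* One global, read-only identity domain for every vIOMMU instance. */
static const struct model_domain model_identity_domain = {
	.type = MODEL_DOMAIN_IDENTITY,
};

/* Attaching needs nothing from dev->iommu, since the domain is stateless. */
static void model_attach_identity(struct model_device *dev)
{
	dev->domain = &model_identity_domain;
}

/* True when both devices point at the same global identity domain. */
static bool model_share_identity(const struct model_device *a,
				 const struct model_device *b)
{
	return a->domain == &model_identity_domain && a->domain == b->domain;
}
```

Two devices behind two different model_iommu instances end up with the same
domain pointer after model_attach_identity(), which is why a static object
would be sufficient under the assumption above.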
> >
> > <snip>
> >
> > > > +static void hv_iommu_shutdown(void)
> > > > +{
> > > > + iommu_device_sysfs_remove(&hv_iommu_device->iommu);
> > > > +
> > > > + kfree(hv_iommu_device);
> > > > +}
> > > > +
> > > > +static struct syscore_ops hv_iommu_syscore_ops = {
> > > > + .shutdown = hv_iommu_shutdown,
> > > > +};
> [...]
> >
> > For iommu_device_sysfs_remove(), I guess they are not necessary, and
> > I will need to do some homework to better understand the sysfs. :)
> > Originally, we wanted a shutdown routine to trigger some hypercall,
> > so that Hyper-V will disable the DMA translation, e.g., during the
> > VM reboot process.
>
> I would presume that if Hyper-V reboots the VM, Hyper-V automatically
> resets the PV IOMMU and prevents any further DMA operations. But
> consider kexec(), where a new kernel gets loaded without going through
> the hypervisor "reboot-this-VM" path. There have been problems in the
> past with kexec() where parts of Hyper-V state for the guest didn't
> get reset, and the PV IOMMU is likely something in that category. So
> there may indeed be a need to tell the hypervisor to reset everything
> related to the PV IOMMU. There are already functions to do Hyper-V
> cleanup: see vmbus_initiate_unload() and hyperv_cleanup(). These
> existing functions may be a better place to do PV IOMMU cleanup/reset
> if needed.
That would be my vote also.
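
As a rough sketch of what that could look like, here is a small userspace
model, not kernel code. The names model_pv_iommu, model_hv_iommu_reset()
and model_hyperv_cleanup() are hypothetical; the thread does not specify a
reset hypercall, so the reset below is only a placeholder for whatever the
hypervisor interface ends up providing. The idea is simply that the common
cleanup path, which also runs for kexec, calls the PV IOMMU reset instead
of relying on a separate syscore shutdown hook.

```c
#include <stdbool.h>

/* Hypothetical model of the PV IOMMU state the hypervisor keeps for a guest. */
struct model_pv_iommu {
	bool translation_enabled;
};

static struct model_pv_iommu model_iommu_state = {
	.translation_enabled = true,
};

/*
 * Placeholder for a reset request to the hypervisor. The real mechanism
 * (if one is needed) is an assumption and is not defined in this thread.
 */
static void model_hv_iommu_reset(void)
{
	model_iommu_state.translation_enabled = false;
}

/*
 * Stand-in for a common cleanup path in the spirit of hyperv_cleanup():
 * it runs before the next kernel takes over, so stale DMA translation
 * state does not leak across kexec.
 */
static void model_hyperv_cleanup(void)
{
	model_hv_iommu_reset();
	/* Other Hyper-V guest state teardown would follow here. */
}
```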