Message-ID: <aXkIGfos4l0kv_lF@skinsburskii.localdomain>
Date: Tue, 27 Jan 2026 10:46:49 -0800
From: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>
To: Mukesh R <mrathor@...ux.microsoft.com>
Cc: linux-kernel@...r.kernel.org, linux-hyperv@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, iommu@...ts.linux.dev,
linux-pci@...r.kernel.org, linux-arch@...r.kernel.org,
kys@...rosoft.com, haiyangz@...rosoft.com, wei.liu@...nel.org,
decui@...rosoft.com, longli@...rosoft.com, catalin.marinas@....com,
will@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, joro@...tes.org,
lpieralisi@...nel.org, kwilczynski@...nel.org, mani@...nel.org,
robh@...nel.org, bhelgaas@...gle.com, arnd@...db.de,
nunodasneves@...ux.microsoft.com, mhklinux@...look.com,
romank@...ux.microsoft.com
Subject: Re: [PATCH v0 12/15] x86/hyperv: Implement hyperv virtual iommu
On Mon, Jan 26, 2026 at 07:02:29PM -0800, Mukesh R wrote:
> On 1/26/26 07:57, Stanislav Kinsburskii wrote:
> > On Fri, Jan 23, 2026 at 05:26:19PM -0800, Mukesh R wrote:
> > > On 1/20/26 16:12, Stanislav Kinsburskii wrote:
> > > > On Mon, Jan 19, 2026 at 10:42:27PM -0800, Mukesh R wrote:
> > > > > From: Mukesh Rathor <mrathor@...ux.microsoft.com>
> > > > >
> > > > > Add a new file to implement management of device domains, mapping and
> > > > > unmapping of iommu memory, and other iommu_ops to fit within the VFIO
> > > > > framework for PCI passthru on Hyper-V running Linux as root or L1VH
> > > > > parent. This also implements direct attach mechanism for PCI passthru,
> > > > > and it is also made to work within the VFIO framework.
> > > > >
> > > > > At a high level, during boot the hypervisor creates a default identity
> > > > > domain and attaches all devices to it. This nicely maps to Linux iommu
> > > > > subsystem IOMMU_DOMAIN_IDENTITY domain. As a result, Linux does not
> > > > > need to explicitly ask Hyper-V to attach devices and do maps/unmaps
> > > > > during boot. As mentioned previously, Hyper-V supports two ways to do
> > > > > PCI passthru:
> > > > >
> > > > > 1. Device Domain: root must create a device domain in the hypervisor,
> > > > > and do map/unmap hypercalls for mapping and unmapping guest RAM.
> > > > > All hypervisor communications use device id of type PCI for
> > > > > identifying and referencing the device.
> > > > >
> > > > > 2. Direct Attach: the hypervisor will simply use the guest's HW
> > > > > page table for mappings, thus the host need not do map/unmap
> > > > > device memory hypercalls. As such, direct attach passthru setup
> > > > > during guest boot is extremely fast. A direct attached device
> > > > > must be referenced via logical device id and not via the PCI
> > > > > device id.
> > > > >
> > > > > At present, L1VH root/parent only supports direct attaches. Also direct
> > > > > attach is default in non-L1VH cases because there are some significant
> > > > > performance issues with device domain implementation currently for guests
> > > > > with higher RAM (say more than 8GB), and that unfortunately cannot be
> > > > > addressed in the short term.
> > > > >
> > > >
> > > > <snip>
> > > >
> >
> > <snip>
> >
> > > > > +static void hv_iommu_detach_dev(struct iommu_domain *immdom, struct device *dev)
> > > > > +{
> > > > > + struct pci_dev *pdev;
> > > > > + struct hv_domain *hvdom = to_hv_domain(immdom);
> > > > > +
> > > > > + /* See the attach function, only PCI devices for now */
> > > > > + if (!dev_is_pci(dev))
> > > > > + return;
> > > > > +
> > > > > + if (hvdom->num_attchd == 0)
> > > > > + pr_warn("Hyper-V: num_attchd is zero (%s)\n", dev_name(dev));
> > > > > +
> > > > > + pdev = to_pci_dev(dev);
> > > > > +
> > > > > + if (hvdom->attached_dom) {
> > > > > + hv_iommu_det_dev_from_guest(hvdom, pdev);
> > > > > +
> > > > > + /* Do not reset attached_dom, hv_iommu_unmap_pages happens
> > > > > + * next.
> > > > > + */
> > > > > + } else {
> > > > > + hv_iommu_det_dev_from_dom(hvdom, pdev);
> > > > > + }
> > > > > +
> > > > > + hvdom->num_attchd--;
> > > >
> > > > Shouldn't this be modified iff the detach succeeded?
> > >
> > > We want to still free the domain and not let it get stuck. The purpose
> > > is more to make sure detach was called before domain free.
> > >
> >
> > How can one debug subsequent errors if num_attchd is decremented
> > unconditionally? In reality the device is left attached, but the related
> > kernel metadata is gone.
>
> Error is printed in case of failed detach. If there is panic, at least
> you can get some info about the device. Metadata in hypervisor is
> around if failed.
>
With this approach the only thing left is a kernel message.
But if the state is kept intact, one could collect a kernel core and
analyze it.
Also note that there won't be a hypervisor core by default: our main
context with the upstreamed version of the driver is L1VH, and a kernel
core is the only thing a third-party customer can provide for our
analysis.
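To illustrate what I mean, here is a rough userspace sketch, not the
actual driver code. Only num_attchd comes from the patch; struct
mock_domain, mock_detach_hypercall() and mock_detach_dev() are made-up
stand-ins, and I'm assuming the detach helper can report failure via an
int return value:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct hv_domain; only the counter matters. */
struct mock_domain {
	unsigned int num_attchd;
};

/* Hypothetical stand-in for the detach hypercall wrapper: 0 on success. */
static int mock_detach_hypercall(int should_fail)
{
	return should_fail ? -5 : 0;
}

/*
 * Decrement num_attchd only when the detach succeeded, so that on
 * failure the kernel-side state still says "attached" and can be
 * inspected in a kernel core.
 */
static int mock_detach_dev(struct mock_domain *dom, int should_fail)
{
	int ret;

	assert(dom != NULL);

	if (dom->num_attchd == 0) {
		fprintf(stderr, "num_attchd is zero\n");
		return -22;
	}

	ret = mock_detach_hypercall(should_fail);
	if (ret) {
		/* Leave the counter untouched: the device is still attached. */
		fprintf(stderr, "detach failed: %d\n", ret);
		return ret;
	}

	dom->num_attchd--;
	return 0;
}
```

The callback in the patch returns void, so the return value above is
only there to make the sketch easy to exercise; the point is just the
ordering of the failure check and the decrement.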
Thanks,
Stanislav