Message-ID: <ZvZKfUQpiv33MQw+@Asurada-Nvidia>
Date: Thu, 26 Sep 2024 23:02:37 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Yi Liu <yi.l.liu@...el.com>
CC: <jgg@...dia.com>, <kevin.tian@...el.com>, <will@...nel.org>,
<joro@...tes.org>, <suravee.suthikulpanit@....com>, <robin.murphy@....com>,
<dwmw2@...radead.org>, <baolu.lu@...ux.intel.com>, <shuah@...nel.org>,
<linux-kernel@...r.kernel.org>, <iommu@...ts.linux.dev>,
<linux-arm-kernel@...ts.infradead.org>, <linux-kselftest@...r.kernel.org>,
<eric.auger@...hat.com>, <jean-philippe@...aro.org>, <mdf@...nel.org>,
<mshavit@...gle.com>, <shameerali.kolothum.thodi@...wei.com>,
<smostafa@...gle.com>
Subject: Re: [PATCH v2 04/19] iommufd: Allow pt_id to carry viommu_id for
IOMMU_HWPT_ALLOC
On Fri, Sep 27, 2024 at 01:38:08PM +0800, Yi Liu wrote:
> > > Does it mean each vIOMMU of VM can only have
> > > one s2 HWPT?
> >
> > Giving some examples here:
> > - If a VM has 1 vIOMMU, there will be 1 vIOMMU object in the
> > kernel holding one S2 HWPT.
> > - If a VM has 2 vIOMMUs, there will be 2 vIOMMU objects in the
> > kernel that can hold two different S2 HWPTs, or share one S2
> > HWPT (saving memory).
>
> So if you have two devices assigned to a VM, then you may have two
> vIOMMUs or one vIOMMU exposed to the guest. This depends on whether the
> two devices are behind the same physical IOMMU. If it's two vIOMMUs, the
> two can share the s2 hwpt if their physical IOMMUs are compatible. Is
> that right?
Yes.
> To achieve the above, you need to know the physical IOMMUs of the
> assigned devices, and hence be able to tell if the physical IOMMUs are
> the same and if they are compatible. How would userspace know such info?
My draft implementation with QEMU does something like this:
- List all viommu-matched iommu nodes under /sys/class/iommu: LINKs
- Get the PCI device's /sys/bus/pci/devices/0000:00:00.0/iommu: LINK0
- Compare LINK0 against the LINKs
So far we don't have an ID for a physical IOMMU instance. If we had
one, returning it via the hw_info call could be an alternative.
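
A rough Python sketch of the sysfs comparison above, for illustration only (this is not the QEMU draft code). It assumes that both the /sys/class/iommu entries and the per-device "iommu" link are symlinks resolving to the same IOMMU device directory. The function name and the sysfs_root parameter are made up here, and the "viommu-matched" filtering step is left out:

```python
import os


def find_physical_iommu(bdf, sysfs_root="/sys"):
    """Return the /sys/class/iommu entry name backing a PCI device, or None.

    bdf is a PCI address such as "0000:00:00.0". sysfs_root is a
    hypothetical parameter so the lookup can run against a fake tree.
    """
    class_dir = os.path.join(sysfs_root, "class", "iommu")
    dev_link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "iommu")
    if not os.path.isdir(class_dir) or not os.path.exists(dev_link):
        return None  # no IOMMU class nodes, or device has no iommu link

    # LINK0: where the device's "iommu" link really points.
    link0 = os.path.realpath(dev_link)

    # LINKs: every node under /sys/class/iommu, compared against LINK0.
    for name in sorted(os.listdir(class_dir)):
        if os.path.realpath(os.path.join(class_dir, name)) == link0:
            return name
    return None
```

Two devices that return the same name here sit behind the same physical IOMMU instance.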
QEMU then does the routing: it assigns PCI buses and builds the IORT
(or DT). It has been suggested that this part move to libvirt, though.
So I think that, at the end of the day, libvirt would run the sysfs
check and assign each device to the PCI bus backed by the correct IOMMU.
Here is an example in which two devices behind iommu0 and a third
device behind iommu1 are assigned to a VM:
-device pxb-pcie,id=pcie.viommu0,bus=pcie.0.... \ # bus for viommu0
-device pxb-pcie,id=pcie.viommu1,bus=pcie.0.... \ # bus for viommu1
-device pcie-root-port,id=pcie.viommu0p0,bus=pcie.viommu0... \
-device pcie-root-port,id=pcie.viommu0p1,bus=pcie.viommu0... \
-device pcie-root-port,id=pcie.viommu1p0,bus=pcie.viommu1... \
-device vfio-pci,bus=pcie.viommu0p0... \ # connect to bus for viommu0
-device vfio-pci,bus=pcie.viommu0p1... \ # connect to bus for viommu0
-device vfio-pci,bus=pcie.viommu1p0... # connect to bus for viommu1
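
As a sketch of how a management layer might derive that layout, the following Python assumes it already knows which physical IOMMU each device sits behind. The function name is hypothetical and the device addresses in the usage note are placeholders; only the bus and root-port naming follows the example above. It produces a plan, not a complete command line:

```python
def plan_buses(dev_to_iommu):
    """Map each device to a root-port id under a per-IOMMU pxb-pcie bus.

    dev_to_iommu maps a device identifier to the name of its physical
    IOMMU. Devices sharing a physical IOMMU share one vIOMMU bus.
    """
    bus_index = {}   # physical IOMMU name -> vIOMMU bus number
    port_count = {}  # vIOMMU bus number -> root ports handed out so far
    plan = {}
    for dev, iommu in dev_to_iommu.items():
        # First device seen behind an IOMMU claims the next bus number.
        idx = bus_index.setdefault(iommu, len(bus_index))
        port = port_count.get(idx, 0)
        port_count[idx] = port + 1
        plan[dev] = f"pcie.viommu{idx}p{port}"
    return plan
```

With two placeholder devices behind "iommu0" and one behind "iommu1", the plan gives the first two root ports on pcie.viommu0 and the third on pcie.viommu1, matching the root-port ids in the example.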
To check whether a stage-2 HWPT can be shared, we would basically
attach the device to one of the stage-2 HWPTs from the list that the
VMM should keep. The attach runs all the compatibility tests, down to
the IOMMU driver. If it fails, just allocate a new stage-2 HWPT.
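
The attach-or-allocate flow could look roughly like the Python below. The try_attach and alloc_s2_hwpt callables are hypothetical stand-ins for whatever the VMM uses to attach a device and to allocate a HWPT; they are not real iommufd bindings. Trying every HWPT in the list before allocating is an assumption of this sketch:

```python
def attach_with_shared_s2(device, s2_hwpts, try_attach, alloc_s2_hwpt):
    """Attach device to a reusable stage-2 HWPT, allocating one if needed.

    s2_hwpts is the list the VMM keeps. try_attach(device, hwpt) returns
    True on success; the kernel and IOMMU driver decide compatibility.
    alloc_s2_hwpt(device) returns a new stage-2 HWPT for that device.
    """
    # Let the attach itself act as the compatibility test.
    for hwpt in s2_hwpts:
        if try_attach(device, hwpt):
            return hwpt

    # Nothing in the list was compatible: allocate a new stage-2 HWPT.
    hwpt = alloc_s2_hwpt(device)
    if not try_attach(device, hwpt):
        raise RuntimeError("attach to a newly allocated stage-2 HWPT failed")
    s2_hwpts.append(hwpt)
    return hwpt
```

The VMM never has to reason about IOMMU compatibility itself; a failed attach is the signal to fall back to a new HWPT.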
Thanks
Nic