[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB5276D9387CB50D58E4A7585F8CBA2@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Fri, 9 Aug 2024 08:00:34 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Nicolin Chen <nicolinc@...dia.com>, Robin Murphy <robin.murphy@....com>
CC: "jgg@...dia.com" <jgg@...dia.com>, "joro@...tes.org" <joro@...tes.org>,
"will@...nel.org" <will@...nel.org>, "shuah@...nel.org" <shuah@...nel.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>
Subject: RE: [PATCH v2 2/3] iommu/dma: Support MSIs through nested domains
> From: Nicolin Chen <nicolinc@...dia.com>
> Sent: Friday, August 9, 2024 7:00 AM
>
> On Thu, Aug 08, 2024 at 01:38:44PM +0100, Robin Murphy wrote:
> > On 06/08/2024 9:25 am, Tian, Kevin wrote:
> > > > From: Nicolin Chen <nicolinc@...dia.com>
> > > > Sent: Saturday, August 3, 2024 8:32 AM
> > > >
> > > > From: Robin Murphy <robin.murphy@....com>
> > > >
> > > > Currently, iommu-dma is the only place outside of IOMMUFD and
> drivers
> > > > which might need to be aware of the stage 2 domain encapsulated
> within
> > > > a nested domain. This would be in the legacy-VFIO-style case where
> we're
> > >
> > > why is it a legacy-VFIO-style? We only support nested in IOMMUFD.
> >
> > Because with proper nesting we ideally shouldn't need the host-managed
> > MSI mess at all, which all stems from the old VFIO paradigm of
> > completely abstracting interrupts from userspace. I'm still hoping
> > IOMMUFD can grow its own interface for efficient MSI passthrough, where
> > the VMM can simply map the physical MSI doorbell into whatever IPA (GPA)
> > it wants it to appear at in the S2 domain, then whatever the guest does
> > with S1 it can program the MSI address into the endpoint accordingly
> > without us having to fiddle with it.
>
> Hmm, until now I wasn't so convinced myself that it could work as I
> was worried about the data. But having a second thought, since the
> host configures the MSI, it can still set the correct data. What we
> only need is to change the MSI address from a RMRed IPA/gIOVA to a
> real gIOVA of the vITS page.
>
> I did a quick hack to test that loop. MSI in the guest still works
> fine without having the RMR node in its IORT. Sweet!
>
> To go further on this path, we will need the following changes:
> - MSI configuration in the host (via a VFIO_IRQ_SET_ACTION_TRIGGER
> hypercall) should set gIOVA instead of fetching from msi_cookie.
> That hypercall doesn't forward an address currently, since host
> kernel pre-sets the msi_cookie. So, we need a way to forward the
> gIOVA to kernel and pack it into the msi_msg structure. I haven't
> read the VFIO PCI code thoroughly, yet wonder if we could just
> let the guest program the gIOVA to the PCI register and fall it
> through to the hardware, so host kernel handling that hypercall
> can just read it back from the register?
> - IOMMUFD should provide VMM a way to tell the gPA (or directly +
> GITS_TRANSLATER?). Then kernel should do the stage-2 mapping. I
> have talked to Jason about this a while ago, and we have a few
> thoughts how to implement it. But eventually, I think we still
> can't avoid a middle man like msi_cookie to associate the gPA in
> IOMMUFD to PA in irqchip?
Probably a new IOMMU_DMA_MSI_COOKIE_USER type which uses
GPA (passed in in ALLOC_HWPT for a nested_parent type) as IOVA
in iommu_dma_get_msi_page()?
>
> One more concern is the MSI window size. VMM sets up a MSI region
> that must fit the hardware window size. Most of ITS versions have
> only one page size but one of them can have multiple pages? What
> if vITS is one-page size while the underlying pITS has multiple?
>
> My understanding of the current kernel-defined 1MB size is also a
> hard-coding window to potential fit all cases, since IOMMU code in
> the code can just eyeball what's going on in the irqchip subsystem
> and adjust accordingly if someday it needs to. But VMM can't?
>
> Thanks
> Nicolin
Powered by blists - more mailing lists