Message-ID: <BN9PR11MB5276599F467AD5EAC935A79E8C719@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Fri, 10 Dec 2021 07:36:12 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>,
Thomas Gleixner <tglx@...utronix.de>
CC: "Jiang, Dave" <dave.jiang@...el.com>,
Logan Gunthorpe <logang@...tatee.com>,
LKML <linux-kernel@...r.kernel.org>,
Bjorn Helgaas <helgaas@...nel.org>,
Marc Zygnier <maz@...nel.org>,
Alex Williamson <alex.williamson@...hat.com>,
"Dey, Megha" <megha.dey@...el.com>,
"Raj, Ashok" <ashok.raj@...el.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jon Mason <jdmason@...zu.us>, Allen Hubbe <allenbh@...il.com>,
"linux-ntb@...glegroups.com" <linux-ntb@...glegroups.com>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
Heiko Carstens <hca@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
"x86@...nel.org" <x86@...nel.org>, Joerg Roedel <jroedel@...e.de>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>
Subject: RE: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()
> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Friday, December 10, 2021 4:59 AM
>
> On Thu, Dec 09, 2021 at 09:32:42PM +0100, Thomas Gleixner wrote:
> > On Thu, Dec 09 2021 at 12:21, Jason Gunthorpe wrote:
> > > On Thu, Dec 09, 2021 at 09:37:06AM +0100, Thomas Gleixner wrote:
> > > If we keep the MSI emulation in the hypervisor then MSI != IMS. The
> > > MSI code needs to write a addr/data pair compatible with the emulation
> > > and the IMS code needs to write an addr/data pair from the
> > > hypercall. Seems like this scenario is best avoided!
> > >
> > > From this perspective I haven't connected how virtual interrupt
> > > remapping helps in the guest? Is this a way to provide the hypercall
> > > I'm imagining above?
> >
> > That was my thought to avoid having different mechanisms.
> >
> > The address/data pair is computed in two places:
> >
> > 1) Activation of an interrupt
> > 2) Affinity setting on an interrupt
> >
> > Both configure the IRTE when interrupt remapping is in place.
> >
> > In both cases a vector is allocated in the vector domain and based on
> > the resulting target APIC / vector number pair the IRTE is
> > (re)configured.
> >
> > So putting the hypercall into the vIRTE update is the obvious
> > place. Both activation and affinity setting can fail and propagate an
> > error code down to the originating caller.
> >
> > Hmm?
>
> Okay, I think I get it. Would be nice to have someone from intel
> familiar with the vIOMMU protocols and qemu code remark what the
> hypervisor side can look like.
>
> There is a bit more work here, we'd have to change VFIO to somehow
> entirely disconnect the kernel IRQ logic from the MSI table and
> directly pass control of it to the guest after the hypervisor IOMMU IR
> secures it. ie directly mmap the msi-x table into the guest
>
It's supported already:

/*
 * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
 * which allows direct access to non-MSIX registers which happened to be within
 * the same system page.
 *
 * Even though the userspace gets direct access to the MSIX data, the existing
 * VFIO_DEVICE_SET_IRQS interface must still be used for MSIX configuration.
 */
#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3

IIRC this was introduced for PPC, for devices that have the MSI-X table in
the same BAR as other MMIO registers. Trapping MSI-X accesses degrades
performance for accesses to the adjacent registers. The MSI-X table can be
mapped by userspace there because PPC already uses a hypercall mechanism
for interrupts. Though I'm unclear about the details, it sounds like a
usage similar to what is proposed here.
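
For illustration, a minimal sketch of how userspace could check for this
capability. It assumes the buffer was filled by VFIO_DEVICE_GET_REGION_INFO
with argsz large enough to hold the capability chain, and that the caller has
already seen VFIO_REGION_INFO_FLAG_CAPS set in flags. struct cap_header and
region_info_has_cap() are hypothetical local names: the struct mirrors the
layout of struct vfio_info_cap_header from <linux/vfio.h> so the sketch
builds without the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical local mirror of struct vfio_info_cap_header (uapi).
 * 'next' is the offset of the next capability from the start of the
 * region info buffer; 0 terminates the chain.
 */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;
};

/* Same value as VFIO_REGION_INFO_CAP_MSIX_MAPPABLE quoted above. */
#define REGION_INFO_CAP_MSIX_MAPPABLE	3

/*
 * Walk the capability chain that starts at cap_offset inside the
 * buffer 'info' of 'len' bytes. Returns 1 if a capability with the
 * given id is present, 0 otherwise (including malformed chains).
 */
int region_info_has_cap(const void *info, size_t len,
			uint32_t cap_offset, uint16_t id)
{
	const uint8_t *base = info;
	uint32_t off = cap_offset;
	size_t hops = 0;

	while (off != 0) {
		struct cap_header hdr;

		/* Reject headers that would run past the buffer. */
		if (off > len || len - off < sizeof(hdr))
			return 0;
		/* Guard against a chain that loops back on itself. */
		if (++hops > len / sizeof(hdr))
			return 0;

		memcpy(&hdr, base + off, sizeof(hdr));
		if (hdr.id == id)
			return 1;
		off = hdr.next;
	}
	return 0;
}
```

A caller would pass the cap_offset field of the returned region info as the
starting offset. Even when the capability is present, per the comment above,
MSI-X configuration still has to go through VFIO_DEVICE_SET_IRQS.
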
Thanks
Kevin