Message-ID: <20201110141323.GB22336@otc-nc-03>
Date: Tue, 10 Nov 2020 06:13:23 -0800
From: "Raj, Ashok" <ashok.raj@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Jason Gunthorpe <jgg@...dia.com>,
Dan Williams <dan.j.williams@...el.com>,
"Tian, Kevin" <kevin.tian@...el.com>,
"Jiang, Dave" <dave.jiang@...el.com>,
Bjorn Helgaas <helgaas@...nel.org>,
"vkoul@...nel.org" <vkoul@...nel.org>,
"Dey, Megha" <megha.dey@...el.com>,
"maz@...nel.org" <maz@...nel.org>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"Pan, Jacob jun" <jacob.jun.pan@...el.com>,
"Liu, Yi L" <yi.l.liu@...el.com>, "Lu, Baolu" <baolu.lu@...el.com>,
"Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
"Luck, Tony" <tony.luck@...el.com>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"eric.auger@...hat.com" <eric.auger@...hat.com>,
"parav@...lanox.com" <parav@...lanox.com>,
"rafael@...nel.org" <rafael@...nel.org>,
"netanelg@...lanox.com" <netanelg@...lanox.com>,
"shahafs@...lanox.com" <shahafs@...lanox.com>,
"yan.y.zhao@...ux.intel.com" <yan.y.zhao@...ux.intel.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"Ortiz, Samuel" <samuel.ortiz@...el.com>,
"Hossain, Mona" <mona.hossain@...el.com>,
"dmaengine@...r.kernel.org" <dmaengine@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
Ashok Raj <ashok.raj@...el.com>
Subject: Re: [PATCH v4 06/17] PCI: add SIOV and IMS capability detection
Thomas,
With all these interrupt message storms ;-), I'm missing how to move towards
an end goal.
On Tue, Nov 10, 2020 at 11:27:29AM +0100, Thomas Gleixner wrote:
> Ashok,
>
> On Mon, Nov 09 2020 at 21:14, Ashok Raj wrote:
> > On Mon, Nov 09, 2020 at 11:42:29PM +0100, Thomas Gleixner wrote:
> >> On Mon, Nov 09 2020 at 13:30, Jason Gunthorpe wrote:
> > Approach to IMS is more of a phased approach.
> >
> > #1 Allow physical device to scale beyond limits of PCIe MSIx
> > Follows current methodology for guest interrupt programming and
> > evolutionary changes rather than drastic.
>
> Trapping MSI[X] writes is there because it allows to hand a device to an
> unmodified guest OS and to handle the case where the MSI[X] entries
> storage cannot be mapped exclusively to the guest.
>
> But aside of this, it's not required if the storage can be mapped
> exclusively, the guest is hypervisor aware and can get a host composed
> message via a hypercall. That works for physical functions and SRIOV,
> but not for SIOV.
It would greatly help if you could write down what you see as blocking
progress in the following areas.
Address Gaps in Spec:
Specs can accommodate change after review, as shown by the number of ECNs
that go on with PCIe ;-). Please add what you would like to see in the spec
if you believe there is a gap today.
Hardware Gaps?
- PASID tagged Interrupts.
- IOMMU Support for PASID based IR.
As I had called out, there are a lot of moving parts, and this requires more
attention.
OS Gaps?
- Lack of ability to identify if platform can use IMS.
- Lack of hypercall.
We will always have devices that have more interrupts but whose usage neither
needs IMS to be directly manipulated by the guest nor requires more than what
is allowed by PCIe in a guest. These devices can scale by adding another
sub-device, which gets you another block of 2048 if needed.
This isn't just for idxd. As I mentioned earlier, there are vendors other
than Intel already working on this. In all of those cases the need for guest
direct manipulation of the interrupt store hasn't come up. From the
discussion, it seems there are devices, today or in the future, that will
require direct manipulation of the interrupt store in the guest. This needs
additional work, both in the device hardware to provide the right plumbing
and in the OS to comprehend those devices.
Cheers,
Ashok