Message-ID: <BN9PR11MB5276AD88A1D1A1AA313E228C8C729@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Sat, 11 Dec 2021 08:06:36 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Jason Gunthorpe <jgg@...dia.com>
CC: "Jiang, Dave" <dave.jiang@...el.com>,
Logan Gunthorpe <logang@...tatee.com>,
LKML <linux-kernel@...r.kernel.org>,
Bjorn Helgaas <helgaas@...nel.org>,
Marc Zyngier <maz@...nel.org>,
Alex Williamson <alex.williamson@...hat.com>,
"Dey, Megha" <megha.dey@...el.com>,
"Raj, Ashok" <ashok.raj@...el.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jon Mason <jdmason@...zu.us>, Allen Hubbe <allenbh@...il.com>,
"linux-ntb@...glegroups.com" <linux-ntb@...glegroups.com>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
Heiko Carstens <hca@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ibm.com>,
"x86@...nel.org" <x86@...nel.org>, Joerg Roedel <jroedel@...e.de>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>
Subject: RE: [patch 21/32] NTB/msi: Convert to msi_on_each_desc()
> From: Thomas Gleixner <tglx@...utronix.de>
> Sent: Friday, December 10, 2021 8:13 PM
>
> >> 5) It's not possible for the kernel to reliably detect whether it is
> >> running on bare metal or not. Yes we talked about heuristics, but
> >> that's something I really want to avoid.
> >
> > How would the hypercall mechanism avoid such heuristics?
>
> The availability of IR remapping where the irqdomain which is provided
> by the remapping unit signals that it supports this new scheme:
>
>                |--IO/APIC
>                |--MSI
> vector -- IR --|--MSI-X
>                |--IMS
>
> while the current scheme is:
>
>                |--IO/APIC
> vector -- IR --|--PCI/MSI[-X]
>
> or
>
>                |--IO/APIC
> vector --------|--PCI/MSI[-X]
>
> So in the new scheme the IR domain will advertise new features which are
> not available on older kernels. The availability of these new features
> is the indicator for the interrupt subsystem and subsequently for PCI
> whether IMS is supported or not.
>
> Bootup either finds an IR unit or not. In the bare metal case that's the
> usual hardware/firmware detection. In the guest case it's the
> availability of vIR including the required hypercall protocol.

Given we have vIR already, there are three scenarios:

1) Bare metal: IR (no hypercall, for sure)
2) VM: vIR (no hypercall, today)
3) VM: vIR (hypercall, tomorrow)

IMS should be allowed only for 1) and 3).

But how do we differentiate 2) from 1) without guest heuristics?
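
To make the question concrete, here is a minimal sketch in plain C (not
kernel code). It assumes that the only things boot-time detection can see
are "an IR unit was found" and "the hypercall protocol was found"; the
names boot_view, view_of, ims_wanted and same_view are made up for
illustration only.

```c
#include <stdbool.h>

/* The three scenarios listed above. */
enum scenario { BARE_METAL_IR, VM_VIR_NO_HYPERCALL, VM_VIR_HYPERCALL };

/* Hypothetical: what boot-time detection observes without any heuristics. */
struct boot_view {
	bool ir_found;        /* IR or vIR unit was detected */
	bool hypercall_found; /* hypercall protocol is available */
};

static struct boot_view view_of(enum scenario s)
{
	struct boot_view v = {
		.ir_found = true, /* all three scenarios expose IR/vIR */
		.hypercall_found = (s == VM_VIR_HYPERCALL),
	};
	return v;
}

/* The desired policy: IMS only for 1) and 3). */
static bool ims_wanted(enum scenario s)
{
	return s != VM_VIR_NO_HYPERCALL;
}

/* True if two scenarios are indistinguishable from what detection sees. */
static bool same_view(struct boot_view a, struct boot_view b)
{
	return a.ir_found == b.ir_found &&
	       a.hypercall_found == b.hypercall_found;
}
```

Under that assumption, view_of() yields the same result for 1) and 2)
while ims_wanted() differs, so no decision based on those two inputs
alone can be right for both.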

Btw, I checked the Qemu history and found that vIR was introduced in 2016:

commit 1121e0afdcfa0cd40e36bd3acff56a3fac4f70fd
Author: Peter Xu <peterx@...hat.com>
Date:   Thu Jul 14 13:56:13 2016 +0800

    x86-iommu: introduce "intremap" property

    Adding one property for intel-iommu devices to specify whether we should
    support interrupt remapping. By default, IR is disabled. To enable it,
    we should use (take Intel IOMMU as example):

      -device intel_iommu,intremap=on

    This property can be shared by Intel and future AMD IOMMUs.

    Signed-off-by: Peter Xu <peterx@...hat.com>
    Reviewed-by: Michael S. Tsirkin <mst@...hat.com>
    Signed-off-by: Michael S. Tsirkin <mst@...hat.com>

>
> > Then Qemu needs to find out the GSI number for the vIRTE handle.
> > Again Qemu doesn't have such information since it doesn't know
> > which MSI[-X] entry points to this handle due to no trap.
> >
> > This implies that we may also need to carry device ID, #msi entry, etc.
> > in the hypercall, so Qemu can associate the virtual routing info
> > to the right [irqfd, gsi].
> >
> > In your model the hypercall is raised by the IR domain. Do you see
> > any problem with finding that information within the IR domain?
>
> IR has the following information available:
>
>    Interrupt type
>       - MSI:   Device, index and number of vectors
>       - MSI-X: Device, index
>       - IMS:   Device, index
>
>    Target APIC/vector pair
>
> IMS: The index depends on the storage type:
>
>      For storage in device memory, e.g. IDXD, it's the array index.
>
>      For storage in system memory, the index is a software artifact.
>
> Does that answer your question?
>
Yes.
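
For reference, the information listed above could be carried in something
like the record below. This is only a sketch in plain C; the struct, its
field names and widths, and the validity helper are hypothetical and not
an existing kernel or hypervisor ABI. The MSI vector-count check reflects
the PCI rule that multi-message MSI uses 1, 2, 4, 8, 16 or 32 vectors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interrupt types the IR domain can tell apart. */
enum irq_src_type { IRQ_SRC_MSI, IRQ_SRC_MSIX, IRQ_SRC_IMS };

/* Hypothetical record of what the IR domain knows per interrupt. */
struct ir_irq_info {
	enum irq_src_type type;
	uint32_t device_id; /* assumed encoding, e.g. a PCI requester ID */
	uint32_t index;     /* MSI/MSI-X entry; for IMS it depends on storage */
	uint32_t nvec;      /* number of vectors; only MSI may use more than 1 */
	uint32_t dest_apic; /* target APIC */
	uint8_t  vector;    /* target vector */
};

/* Sanity check on the vector count for each interrupt type. */
static bool ir_irq_info_valid(const struct ir_irq_info *info)
{
	uint32_t n = info->nvec;

	if (info->type == IRQ_SRC_MSI)
		/* power of two, at most 32 */
		return n >= 1 && n <= 32 && (n & (n - 1)) == 0;

	/* MSI-X and IMS are described one entry at a time */
	return n == 1;
}
```

With device, index and type available like this, the hypervisor side would
have what it needs to associate the routing info with the right
[irqfd, gsi].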
Thanks
Kevin