Message-ID: <20210112073427.GE4678@unreal>
Date: Tue, 12 Jan 2021 09:34:27 +0200
From: Leon Romanovsky <leon@...nel.org>
To: "Tian, Kevin" <kevin.tian@...el.com>
Cc: Lu Baolu <baolu.lu@...ux.intel.com>,
Jason Gunthorpe <jgg@...dia.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"Raj, Ashok" <ashok.raj@...el.com>,
"Jiang, Dave" <dave.jiang@...el.com>,
"Dey, Megha" <megha.dey@...el.com>,
"dwmw2@...radead.org" <dwmw2@...radead.org>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"dmaengine@...r.kernel.org" <dmaengine@...r.kernel.org>,
"eric.auger@...hat.com" <eric.auger@...hat.com>,
"Pan, Jacob jun" <jacob.jun.pan@...el.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"maz@...nel.org" <maz@...nel.org>,
"Hossain, Mona" <mona.hossain@...el.com>,
"netanelg@...lanox.com" <netanelg@...lanox.com>,
"parav@...lanox.com" <parav@...lanox.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"rafael@...nel.org" <rafael@...nel.org>,
"Ortiz, Samuel" <samuel.ortiz@...el.com>,
"Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
"shahafs@...lanox.com" <shahafs@...lanox.com>,
"Luck, Tony" <tony.luck@...el.com>,
"vkoul@...nel.org" <vkoul@...nel.org>,
"yan.y.zhao@...ux.intel.com" <yan.y.zhao@...ux.intel.com>,
"Liu, Yi L" <yi.l.liu@...el.com>
Subject: Re: [RFC PATCH v2 1/1] platform-msi: Add platform check for subdevice irq domain
On Tue, Jan 12, 2021 at 06:59:35AM +0000, Tian, Kevin wrote:
> > From: Leon Romanovsky <leon@...nel.org>
> > Sent: Tuesday, January 12, 2021 1:53 PM
> >
> > On Tue, Jan 12, 2021 at 01:17:11PM +0800, Lu Baolu wrote:
> > > Hi,
> > >
> > > On 1/7/21 3:16 PM, Leon Romanovsky wrote:
> > > > On Thu, Jan 07, 2021 at 06:55:16AM +0000, Tian, Kevin wrote:
> > > > > > From: Leon Romanovsky <leon@...nel.org>
> > > > > > Sent: Thursday, January 7, 2021 2:09 PM
> > > > > >
> > > > > > On Thu, Jan 07, 2021 at 02:04:29AM +0000, Tian, Kevin wrote:
> > > > > > > > From: Leon Romanovsky <leon@...nel.org>
> > > > > > > > Sent: Thursday, January 7, 2021 12:02 AM
> > > > > > > >
> > > > > > > > On Wed, Jan 06, 2021 at 11:23:39AM -0400, Jason Gunthorpe wrote:
> > > > > > > > > On Wed, Jan 06, 2021 at 12:40:17PM +0200, Leon Romanovsky wrote:
> > > > > > > > >
> > > > > > > > > > I asked what you will do when QEMU gains the needed
> > > > > > > > > > functionality. Will you remove QEMU from this list? If yes, how
> > > > > > > > > > will such a "new" kernel work on old QEMU versions?
> > > > > > > > >
> > > > > > > > > The needed functionality is some VMM hypercall, so presumably new
> > > > > > > > > kernels that support calling this hypercall will be able to discover
> > > > > > > > > if the VMM hypercall exists and, if so, supersede this entire check.
> > > > > > > >
> > > > > > > > Let's not speculate. Do we have a well-known path?
> > > > > > > > Will such a patch be taken to stable@...stros?
> > > > > > > >
> > > > > > >
> > > > > > > There are two functions introduced in this patch. One detects whether
> > > > > > > we are running on bare metal or in a virtual machine. The other decides
> > > > > > > whether the platform supports IMS. Currently the two are identical
> > > > > > > because IMS is supported only on bare metal at this stage. In the future
> > > > > > > it will look like below when IMS can be enabled in a VM:
> > > > > > >
> > > > > > > bool arch_support_pci_device_ims(struct pci_dev *pdev)
> > > > > > > {
> > > > > > >         return on_bare_metal() || hypercall_irq_domain_supported();
> > > > > > > }
> > > > > > >
> > > > > > > The VMM vendor list is for on_bare_metal(), and presumably a vendor will
> > > > > > > never be removed once added to the list, since the fact of running in a
> > > > > > > VM never changes, regardless of whether this hypervisor supports extra
> > > > > > > VMM hypercalls.
> > > > > >
> > > > > > This is what I imagined, this list will be forever, and this worries me.
> > > > > >
> > > > > > I don't know if it is true or not, but I guess that at least Oracle and
> > > > > > Microsoft bare metal devices and VMs will have the same DMI_SYS_VENDOR.
> > > > >
> > > > > It's true. David Woodhouse also said it's the case for Amazon EC2 instances.
> > > > >
> > > > > >
> > > > > > It means that this on_bare_metal() function won't work reliably in many
> > > > > > cases. Also, being part of include/linux/msi.h, at some point this
> > > > > > function will be picked up by outside users for non-IMS cases.
> > > > > >
> > > > > > I didn't even mention custom forks of QEMU which are prohibited from
> > > > > > changing DMI_SYS_VENDOR and private clouds with custom solutions.
> > > > >
> > > > > In this case the private QEMU forks are encouraged to set CPUID
> > > > > (X86_FEATURE_HYPERVISOR) if they do plan to adopt a different vendor name.
> > > >
> > > > Does QEMU set this bit when it runs in host-passthrough CPU model?
> > > >
> > > > >
> > > > > >
> > > > > > The current array makes the DMI_SYS_VENDOR interface some sort of ABI. If
> > > > > > in the future QEMU decides to use a more hipster name, for example "qEmU",
> > > > > > this function won't work.
> > > > > >
> > > > > > I'm aware that DMI_SYS_VENDOR is used heavily in the kernel code, and the
> > > > > > various names for the same company are a good example of how unreliable
> > > > > > it is.
> > > > > >
> > > > > > The most hilarious example is "Dell/Dell Inc./Dell Inc/Dell Computer
> > > > > > Corporation/Dell Computer", but other companies are not far from them.
> > > > > >
> > > > > > Luckily enough, this identification is used for hardware product that
> > > > > > was released to the market and their name will be stable for that
> > > > > > specific model. It is not the case here where we need to ensure future
> > > > > > compatibility too (old kernel on new VM emulator).
> > > > > >
> > > > > > I'm not in a position to say yes or no to this patch and don't have plans
> > > > > > to do so. Just expressing my feeling that this solution is too hacky for
> > > > > > my taste.
> > > > > >
> > > > >
> > > > > I agree with your worries, and solely relying on DMI_SYS_VENDOR is
> > > > > definitely too hacky. In previous discussions with Thomas there was no
> > > > > elegant way to handle this situation. It has to be a heuristic approach.
> > > > > First, we hope the CPUID bit is set properly in most cases, so it is
> > > > > checked first. Then other heuristics can be made for the remaining cases.
> > > > > DMI_SYS_VENDOR is the first hint and more can be added later. For example,
> > > > > when an IOMMU is present there is a vendor-specific way to detect whether
> > > > > it's real or virtual. Dave also mentioned some BIOS flag to indicate a
> > > > > virtual machine. Now probably the real question here is whether people
> > > > > are OK with a CPUID+DMI_SYS_VENDOR combo check for now (and grow it
> > > > > later) or prefer having all heuristics identified so far in place together...
> > > >
> > > > IMHO, it should be as close as possible to the end result.
> > >
> > > Okay! This seems to be a right way to go.
> > >
> > > The SMBIOS defines a 'virtual machine' bit in the BIOS characteristics
> > > extension byte. It could be used as a possible way.
> > >
> > > In order to support an emulated IOMMU for a fully virtualized guest, the
> > > IOMMU vendors defined methods to distinguish between bare metal and VMM
> > > (caching mode in VT-d, for example).
> > >
> > > I will go ahead with adding the above two methods before checking the block
> > > list.
> >
> > I'm still curious to hear an answer to my question above:
> > "Does QEMU set this bit when it runs in host-passthrough CPU model?"
>
> Yes, the bit is also set in this model.
Great, thanks.
>
> Thanks
> Kevin
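
For readers following the thread, the ordering of heuristics discussed above (CPUID hypervisor bit first, then the SMBIOS 'virtual machine' bit and the IOMMU caching-mode hint, and only then the DMI_SYS_VENDOR block list) could be sketched roughly as below. This is a userspace illustration only, not the code from the patch: struct platform_hints, vendor_is_vmm() and the contents of vmm_vendors[] are hypothetical names made up for this sketch, and the hints are passed in as plain booleans instead of being read from hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot of the hints mentioned in the thread. */
struct platform_hints {
	bool cpuid_hypervisor_bit;  /* X86_FEATURE_HYPERVISOR seen via CPUID */
	bool smbios_vm_bit;         /* SMBIOS 'virtual machine' characteristics bit */
	bool iommu_caching_mode;    /* e.g. VT-d caching mode, hints at an emulated IOMMU */
	const char *dmi_sys_vendor; /* DMI_SYS_VENDOR string, may be NULL */
};

/* Illustrative block list only; these are NOT the entries from the patch. */
static const char *const vmm_vendors[] = { "QEMU", "Example VMM Vendor" };

/* Case-sensitive substring match against the block list. */
static bool vendor_is_vmm(const char *vendor)
{
	size_t i;

	if (!vendor)
		return false;

	for (i = 0; i < sizeof(vmm_vendors) / sizeof(vmm_vendors[0]); i++) {
		if (strstr(vendor, vmm_vendors[i]))
			return true;
	}
	return false;
}

/* Strongest hints first; the vendor string is only the last resort. */
static bool on_bare_metal(const struct platform_hints *h)
{
	if (h->cpuid_hypervisor_bit)
		return false;
	if (h->smbios_vm_bit)
		return false;
	if (h->iommu_caching_mode)
		return false;

	return !vendor_is_vmm(h->dmi_sys_vendor);
}
```

With this ordering, a guest that reports the CPUID bit never reaches the string match. A guest with no other hint and a renamed vendor such as "qEmU" would still be reported as bare metal, which is the weakness of the block list raised earlier in the thread.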