Message-ID: <MWHPR11MB16451C94F2A2E7D57AEFE6E68C2E0@MWHPR11MB1645.namprd11.prod.outlook.com>
Date:   Tue, 1 Sep 2020 09:45:29 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Niklas Schnelle <schnelle@...ux.ibm.com>,
        Bjorn Helgaas <helgaas@...nel.org>,
        Alex Williamson <alex.williamson@...hat.com>
CC:     Matthew Rosato <mjrosato@...ux.ibm.com>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "pmorel@...ux.ibm.com" <pmorel@...ux.ibm.com>,
        "mpe@...erman.id.au" <mpe@...erman.id.au>,
        "oohall@...il.com" <oohall@...il.com>,
        "linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "Pan, Jacob jun" <jacob.jun.pan@...el.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>
Subject: RE: [PATCH v3] PCI: Introduce flag for detached virtual functions

> From: kvm-owner@...r.kernel.org <kvm-owner@...r.kernel.org> On Behalf
> Of Niklas Schnelle
> Sent: Friday, August 28, 2020 5:10 PM
> To: Bjorn Helgaas <helgaas@...nel.org>; Alex Williamson
> <alex.williamson@...hat.com>
> 
[...]
> >>
> >> FWIW, pci_physfn() never returns NULL, it returns the provided pdev if
> >> is_virtfn is not set.  This proposal wouldn't change that return value.
> >> AIUI pci_physfn(), the caller needs to test that the returned device is
> >> different from the provided device if there's really code that wants to
> >> traverse to the PF.
> >
> > Oh, so this VF has is_virtfn==0.  That seems weird.  There are lots of
> > other ways that a VF is different: Vendor/Device IDs are 0xffff, BARs
> > are zeroes, etc.
> >
> > It sounds like you're sweeping those under the rug by avoiding the
> > normal enumeration path (e.g., you don't have to size the BARs), but
> > if it actually is a VF, it seems like there might be fewer surprises
> > if we treat it as one.
> >
> > Why don't you just set is_virtfn=1 since it *is* a VF, and then deal
> > with the special cases where you want to touch the PF?
> >
> > Bjorn
> >
> 
> As we are always running under at least a machine level hypervisor
> we're somewhat in the same situation as e.g. a KVM guest in
> that the VFs we see have some emulation that makes them act more like
> normal PCI functions. It just so happens that the machine level hypervisor
> does not emulate the PCI_COMMAND_MEMORY, it does emulate BARs and
> Vendor/Device IDs though.
> So is_virtfn is 0 for some VF for the same reason it is 0 on
> KVM/ESXi/HyperV/Jailhouse… guests on other architectures.

I wonder whether it would be a good idea to also find a way to set
is_virtfn for a normal KVM guest that gets a VF assigned. There are other
cases where faithful emulation of certain PCI capabilities is difficult,
e.g. when enabling guest SVA-related features (PASID/ATS/PRS). Per the
PCIe spec, some or all fields of those capabilities are shared between
the PF and its VFs. Among them:

1) Some can be emulated properly and reflected indirectly in hardware,
e.g. Intel VT-d provides additional per-VF controls over whether to
accept page requests, execute/privileged permissions, etc., thus
allowing VF-specific control even when the device-side setting is
shared;
	
2) Some can be emulated purely in software, and it is harmless to leave
the hardware following the PF setting, e.g. ATS enable, STU(?),
outstanding page request allocation, etc.;

3) However, I didn't see a clean way of emulating page_request_ctrl.reset
and page_request_status.stopped. Both have clearly defined semantics with
respect to outstanding page requests. Because they are shared, we cannot
take a physical action just because one VF requested it, while pure
software emulation cannot guarantee the expected behavior. Of course this
issue also exists on bare metal: pci_enable/disable/reset_pri do nothing
for a VF. There it can at least be mitigated (e.g. with a timeout), which
is not possible in a guest that doesn't know the device is actually a VF.
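
To make the bare-metal behaviour in 3) concrete, here is a minimal
userspace sketch of how I read the VF handling in the PRI helpers. It is
a simplified model, not the kernel code: struct model_pci_dev and the
model_* functions are hypothetical stand-ins for struct pci_dev,
pci_physfn(), pci_enable_pri() and pci_reset_pri(), and the pri_resets
counter is invented purely for illustration. The real helpers may differ
in detail.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pci_dev. */
struct model_pci_dev {
	bool is_virtfn;               /* set only when the OS knows this is a VF */
	bool pri_enabled;             /* PRI has been enabled on this function */
	int pri_resets;               /* illustration only: counts "physical" resets */
	struct model_pci_dev *physfn; /* owning PF, meaningful only if is_virtfn */
};

/* Like pci_physfn(): never NULL; returns dev itself unless is_virtfn is set. */
static struct model_pci_dev *model_physfn(struct model_pci_dev *dev)
{
	if (dev->is_virtfn && dev->physfn)
		return dev->physfn;
	return dev;
}

/* A known VF never touches the shared capability; it follows the PF state. */
static int model_enable_pri(struct model_pci_dev *dev)
{
	if (dev->is_virtfn)
		return model_physfn(dev)->pri_enabled ? 0 : -EINVAL;

	dev->pri_enabled = true; /* stand-in for programming the capability */
	return 0;
}

/* A known VF silently succeeds; only a PF with PRI disabled really resets. */
static int model_reset_pri(struct model_pci_dev *dev)
{
	if (dev->is_virtfn)
		return 0;
	if (dev->pri_enabled)
		return -EBUSY;

	dev->pri_resets++; /* stand-in for setting the reset bit */
	return 0;
}
```

In this model a function with is_virtfn == 0 always takes the PF path, so
a guest that sees an assigned VF as an ordinary function would try to
program and reset the (emulated) capability itself, which is exactly the
case the host cannot back with a physical action.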

Setting is_virtfn=1 allows the guest to be cooperative, as if running
together with the PF driver. But there is an ordering issue: the guest
learns whether a device is a VF only when the VF driver is loaded (based
on the PCI ID), while the related capabilities might already have been
enabled when the device was attached to the IOMMU (at least for
intel_iommu). I suppose that is not hard to fix, though.

Last, "detached VF" is not a PCI-SIG definition, so the host still needs
to emulate those capabilities properly (even if not faithfully) for
guests that don't recognize detached VFs.

Thoughts?

Thanks
Kevin
