Date:   Tue, 29 May 2018 10:34:29 -0400
From:   Don Dutile <ddutile@...hat.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>,
        Bjorn Helgaas <helgaas@...nel.org>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
        netdev@...r.kernel.org, Sathya Perla <sathya.perla@...adcom.com>,
        Felix Manlunas <felix.manlunas@...iumnetworks.com>,
        alexander.duyck@...il.com, john.fastabend@...il.com,
        Jacob Keller <jacob.e.keller@...el.com>,
        oss-drivers@...ronome.com, Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH] PCI: allow drivers to limit the number of VFs to 0

On 05/25/2018 05:05 PM, Jakub Kicinski wrote:
> On Fri, 25 May 2018 09:02:23 -0500, Bjorn Helgaas wrote:
>> On Thu, May 24, 2018 at 06:20:15PM -0700, Jakub Kicinski wrote:
>>> On Thu, 24 May 2018 18:57:48 -0500, Bjorn Helgaas wrote:
>>>> On Mon, Apr 02, 2018 at 03:46:52PM -0700, Jakub Kicinski wrote:
>>>>> Some user space depends on being able to enable sriov_totalvfs
>>>>> VFs without failure, e.g.:
>>>>>
>>>>> $ cat .../sriov_totalvfs > .../sriov_numvfs
>>>>>
>>>>> For devices whose VF support depends on the loaded FW we have the
>>>>> pci_sriov_{g,s}et_totalvfs() API.  However, this API uses 0 as
>>>>> a special "unset" value, meaning drivers can't limit sriov_totalvfs
>>>>> to 0.  Remove the special value completely and simply initialize
>>>>> driver_max_VFs to total_VFs.  Then always use driver_max_VFs.
>>>>> Add a helper for drivers to reset the VF limit back to total.
>>>>
>>>> I still can't really make sense out of the changelog.
>>>>
>>>> I think part of the reason it's confusing is because there are two
>>>> things going on:
>>>>
>>>>    1) You want this:
>>>>    
>>>>         pci_sriov_set_totalvfs(dev, 0);
>>>>         x = pci_sriov_get_totalvfs(dev);
>>>>
>>>>       to return 0 instead of total_VFs.  That seems to connect with
>>>>       your subject line.  It means "sriov_totalvfs" in sysfs could be
>>>>       0, but I don't know how that is useful (I'm sure it is; just
>>>>       educate me :))
>>>
>>> Let me just quote the bug report that got filed on our internal bug
>>> tracker :)
>>>
>>>    When testing Juju Openstack with Ubuntu 18.04, enabling SR-IOV causes
>>>    errors because Juju gets the sriov_totalvfs for an SR-IOV-capable
>>>    device and then tries to set that as the sriov_numvfs parameter.
>>>
>>>    For SR-IOV-incapable FW, the sriov_totalvfs parameter should be 0,
>>>    but it's set to the max.  When FW is switched to flower*, the correct
>>>    sriov_totalvfs value is presented.
>>>
>>> * flower is a project name
>>
>>  From the point of view of the PCI core (which knows nothing about
>> device firmware and relies on the architected config space described
>> by the PCIe spec), this sounds like an erratum: with some firmware
>> installed, the device is not capable of SR-IOV, but still advertises
>> an SR-IOV capability with "TotalVFs > 0".
>>
>> Regardless of whether that's an erratum, we do allow PF drivers to use
>> pci_sriov_set_totalvfs() to limit the number of VFs that may be
>> enabled by writing to the PF's "sriov_numvfs" sysfs file.
> 
> Think of it more as an FPGA which can be reprogrammed at runtime to have
> different capabilities than as an erratum.  Some FWs simply have no use
> for VFs and save resources (and validation time) by not supporting them.
> 
Sure, then the steps should be:
a) (re-)program the FPGA
b) invoke hot-plug for the new device.
    -- by default, VFs aren't configured (enabled) in a Linux kernel;
       -- some drivers provide boot-time enablement, but that becomes
          system-wide, and can cause major config issues when multiple
          instances of a device are installed in the system.
       -- otherwise, configure via sysfs, as sketched below
    -- this should clear/reset the VF values too.
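
As an aside, such a sysfs write lands in the PF driver's ->sriov_configure()
hook; a minimal sketch of one, with hypothetical foo_* names and error
handling trimmed:

#include <linux/pci.h>

/* Called by sriov_numvfs_store() when user space writes the
 * "sriov_numvfs" sysfs file; the core has already capped num_vfs
 * at pci_sriov_get_totalvfs(pdev). */
static int foo_sriov_configure(struct pci_dev *pdev, int num_vfs)
{
	if (num_vfs == 0) {
		pci_disable_sriov(pdev);	/* tear down all VFs */
		return 0;
	}

	/* on success, return how many VFs were enabled */
	return pci_enable_sriov(pdev, num_vfs) ?: num_vfs;
}

static struct pci_driver foo_driver = {
	.name		  = "foo",
	.sriov_configure  = foo_sriov_configure,
	/* .id_table, .probe, .remove elided for brevity */
};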

>> But the current implementation does not allow a PF driver to limit VFs
>> to 0, and that does seem nonsensical.
>>
>>> My understanding is OpenStack uses sriov_totalvfs to determine how many
>>> VFs can be enabled; this looks like the relevant code:
>>>
>>> http://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/tree/hooks/neutron_ovs_utils.py#n464
>>>    
>>>>    2) You're adding the pci_sriov_reset_totalvfs() interface.  I'm not
>>>>       sure what you intend for this.  Is *every* driver supposed to
>>>>       call it in .remove()?  Could/should this be done in the core
>>>>       somehow instead of depending on every driver?
>>>
>>> Good question, I was just thinking yesterday we may want to call it
>>> from the core, but I don't think it's strictly necessary nor always
>>> sufficient (we may reload FW without re-probing).
>>>
>>> We have a device which supports a different number of VFs depending on
>>> the FW loaded.  Some legacy FWs do not inform the driver how many VFs
>>> they can support, because they support the max.  So the flow in our
>>> driver is this:
>>>
>>> load_fw(dev);
>>> ...
>>> max_vfs = ask_fw_for_max_vfs(dev);
>>> if (max_vfs >= 0)
>>> 	return pci_sriov_set_totalvfs(dev, max_vfs);
>>> else /* FW didn't tell us, assume max */
>>> 	return pci_sriov_reset_totalvfs(dev);
>>>
>>> We also reset the max on device remove, but that's not strictly
>>> necessary.
>>>
>>> Other users of pci_sriov_set_totalvfs() always know the value to set
>>> the total to (either always get it from FW or it's a constant).
>>>
>>> If you prefer, we can work out the correct max for those legacy cases in
>>> the driver as well, although it seemed cleaner to just ask the core,
>>> since it already has the total_VFs value handy :)
>>>    
>>>> I'm also having a hard time connecting your user-space command example
>>>> with the rest of this.  Maybe it will make more sense to me tomorrow
>>>> after some coffee.
>>>
>>> OpenStack assumes it will always be able to set sriov_numvfs to
>>> sriov_totalvfs; see this 'if':
>>>
>>> http://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/tree/hooks/neutron_ovs_utils.py#n512
>>
>> Thanks for educating me.  I think there are two issues here that we
>> can separate.  I extracted the patch below for the first.
>>
>> The second is the question of resetting driver_max_VFs.  I think we
>> currently have a general issue in the core:
>>
>>    - load PF driver 1
>>    - driver calls pci_sriov_set_totalvfs() to reduce driver_max_VFs
>>    - unload PF driver 1
>>    - load PF driver 2
>>
>> Now driver_max_VFs is still stuck at the lower value set by driver 1.
>> I don't think that's the way this should work.
>>
>> I guess this is partly a consequence of setting driver_max_VFs in
>> sriov_init(), which is called before driver attach and should only
>> depend on hardware characteristics, so it is related to the patch
>> below.  But I think we should fix it in general, not just for
>> netronome.
> 
> Okay, perfect.  That makes sense.  The patch below certainly fixes the
> first issue for us.  Thank you!
> 
> As far as the second issue goes - agreed, having the core reset the
> number of VFs to total_VFs definitely makes sense.  It doesn't cater to
> the case where FW is reloaded without reprobing, but we don't do this
> today anyway.
> 
> Should I try to come up with a patch to reset total_VFs after detach?
> 
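
For reference, a core-side reset of the kind being discussed might look
roughly like this (an illustrative sketch in drivers/pci/iov.c, not a
merged fix):

#include <linux/pci.h>
#include "pci.h"	/* core-private struct pci_sriov */

/* Restore the VF limit when the PF driver detaches, so a cap set by
 * driver 1 cannot leak into driver 2's probe; would be called from
 * the PCI core's driver-detach path. */
static void pci_sriov_restore_totalvfs(struct pci_dev *dev)
{
	if (dev->is_physfn)
		dev->sriov->driver_max_VFs = dev->sriov->total_VFs;
}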
>> commit 4a338bc6f94b9ad824ac944f5dfc249d6838719c
>> Author: Jakub Kicinski <jakub.kicinski@...ronome.com>
>> Date:   Fri May 25 08:18:34 2018 -0500
>>
>>      PCI/IOV: Allow PF drivers to limit total_VFs to 0
>>      
>>      Some SR-IOV PF drivers implement .sriov_configure(), which allows
>>      user-space to enable VFs by writing the desired number of VFs to the sysfs
>>      "sriov_numvfs" file (see sriov_numvfs_store()).
>>      
>>      The PCI core limits the number of VFs to the TotalVFs advertised by the
>>      device in its SR-IOV capability.  The PF driver can limit the number of VFs
>>      to even fewer (it may have pre-allocated data structures or knowledge of
>>      device limitations) by calling pci_sriov_set_totalvfs(), but previously it
>>      could not limit the VFs to 0.
>>      
>>      Change pci_sriov_get_totalvfs() so it always respects the VF limit imposed
>>      by the PF driver, even if the limit is 0.
>>      
>>      This sequence:
>>      
>>        pci_sriov_set_totalvfs(dev, 0);
>>        x = pci_sriov_get_totalvfs(dev);
>>      
>>      previously set "x" to TotalVFs from the SR-IOV capability.  Now it will set
>>      "x" to 0.
>>      
>>      Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
>>      Signed-off-by: Bjorn Helgaas <bhelgaas@...gle.com>
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index 192b82898a38..d0d73dbbd5ca 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -469,6 +469,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
>>   	iov->nres = nres;
>>   	iov->ctrl = ctrl;
>>   	iov->total_VFs = total;
>> +	iov->driver_max_VFs = total;
>>   	pci_read_config_word(dev, pos + PCI_SRIOV_VF_DID, &iov->vf_device);
>>   	iov->pgsz = pgsz;
>>   	iov->self = dev;
>> @@ -827,10 +828,7 @@ int pci_sriov_get_totalvfs(struct pci_dev *dev)
>>   	if (!dev->is_physfn)
>>   		return 0;
>>   
>> -	if (dev->sriov->driver_max_VFs)
>> -		return dev->sriov->driver_max_VFs;
>> -
>> -	return dev->sriov->total_VFs;
>> +	return dev->sriov->driver_max_VFs;
>>   }
>>   EXPORT_SYMBOL_GPL(pci_sriov_get_totalvfs);
>>   
> 
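
For context, a rough driver-side sketch of what the patch above enables
(the foo_* names are hypothetical):

#include <linux/pci.h>

static int foo_cap_vfs(struct pci_dev *pdev)
{
	/* hypothetical FW query: how many VFs does the current FW
	 * image actually support?  May legitimately be 0. */
	int max_vfs = foo_fw_query_max_vfs(pdev);

	if (max_vfs < 0)
		return max_vfs;

	/* 0 is now honored: "sriov_totalvfs" in sysfs reads 0, and a
	 * non-zero write to "sriov_numvfs" fails with -ERANGE instead
	 * of attempting to enable VFs the FW cannot back. */
	return pci_sriov_set_totalvfs(pdev, max_vfs);
}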
