Message-ID: <20151016165627.GA52728@bhshelto-vm>
Date:	Fri, 16 Oct 2015 11:56:28 -0500
From:	Ben Shelton <benjamin.h.shelton@...el.com>
To:	Bjorn Helgaas <helgaas@...nel.org>
Cc:	Alexander Duyck <alexander.duyck@...il.com>, bhelgaas@...gle.com,
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] PCI: IOV: read SRIOV_NUM_VF after enabling ARI

Hi Bjorn,

> What problem does this patch solve, Ben?  I assume you have devices
> that do change TotalVFs when ARI is enabled, and you do want the new
> value?
> 
> Or is the problem something like the following:
> 
>   - ...
>   - Linux PCI core sees TotalVFs = X (saved as iov->total_VFs)
>   - Linux sets ARI Capable Hierarchy
>   - Device changes TotalVFs to X + Y (but PCI core doesn't notice)
>   - Driver reads TotalVFs and sees X + Y
>   - Driver attempts pci_enable_sriov(dev, X + Y), which fails because
>     sriov_enable() sees "X + Y > iov->total_VFs"

Here's a short snippet from the databook for the PCI Express controller we're
using:

"Supports two sets of VF Stride, First VF Offset, InitialVFs, and TotalVFs
registers per PF—one each for ARI and non-ARI hierarchies. Selection is
performed by host software through the ARI Capable Hierarchy bit of the Control
register in the PF0 SR-IOV capability structure."

The values in InitialVFs and TotalVFs are HWinit for each set of registers.

So the issue this is intended to fix is the following:

- Linux PCI core sees TotalVFs = X (saved as iov->total_VFs).
- Linux sets ARI Capable Hierarchy.
- Device switches to exposing the second set of registers, where
  InitialVFs = TotalVFs = Y (where Y > X).
- User enables one or more VFs on the device, e.g. by writing a value to
  sriov_numvfs in sysfs.
- Driver calls pci_enable_sriov() for the device, which then calls
  sriov_enable().  sriov_enable() reads InitialVFs (= Y) and checks whether it
  is greater than iov->total_VFs (= X).  Since Y > X, the check trips, and
  sriov_enable() fails with -EIO.
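
To make the ordering problem concrete, here is a minimal userspace sketch of
the sequence above.  It is not kernel code: struct model_pf and the model_*
helpers are hypothetical names made up for illustration, and
model_sriov_enable() keeps only the InitialVFs vs. cached TotalVFs comparison
described above, assuming InitialVFs = TotalVFs in each register set as on
our controller.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a PF with two HWinit register sets. */
struct model_pf {
	uint16_t vfs_non_ari;	/* InitialVFs = TotalVFs, non-ARI set (X) */
	uint16_t vfs_ari;	/* InitialVFs = TotalVFs, ARI set (Y) */
	int ari_enabled;	/* models the ARI Capable Hierarchy bit */
	uint16_t cached_total;	/* stands in for iov->total_VFs */
};

/* The device exposes one register set or the other depending on the bit. */
static uint16_t model_read_vfs(const struct model_pf *pf)
{
	return pf->ari_enabled ? pf->vfs_ari : pf->vfs_non_ari;
}

/* Failing order: cache TotalVFs first, then set ARI Capable Hierarchy. */
static void model_init_cache_first(struct model_pf *pf)
{
	pf->cached_total = model_read_vfs(pf);
	pf->ari_enabled = 1;
}

/* Order proposed by the patch: set ARI first, then cache TotalVFs. */
static void model_init_ari_first(struct model_pf *pf)
{
	pf->ari_enabled = 1;
	pf->cached_total = model_read_vfs(pf);
}

/* Simplified stand-in for the check in sriov_enable(). */
static int model_sriov_enable(const struct model_pf *pf)
{
	uint16_t initial = model_read_vfs(pf);

	if (initial > pf->cached_total)
		return -EIO;
	return 0;
}
```

With X = 4 and Y = 8, model_init_cache_first() leaves cached_total at 4, so
model_sriov_enable() returns -EIO; model_init_ari_first() caches 8 and the
same call returns 0.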

> 
> I'm a little dubious about drivers reading the SRIOV capability
> directly, so maybe this is a symptom of deeper problems.

I agree that the driver should not be reading the capability directly, but from
what I understand, the device itself is intended to look at the ARI Capable
Hierarchy bit.  From the PCI SR-IOV spec, revision 1.1:

"ARI Capable Hierarchy is a hint to the Device that ARI has been enabled in the
Root Port or Switch Downstream Port immediately above the Device."

Ben

> 
> Bjorn