lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 28 Oct 2015 14:37:17 -0700 From: Alexander Duyck <alexander.duyck@...il.com> To: Bjorn Helgaas <helgaas@...nel.org> Cc: Alexander Duyck <aduyck@...antis.com>, bhelgaas@...gle.com, linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org, Ben Shelton <benjamin.h.shelton@...el.com> Subject: Re: [PATCH 2/5] iov: Reset resources to 0 if totalVFs increases after enabling ARI On 10/28/2015 12:52 PM, Bjorn Helgaas wrote: > On Wed, Oct 28, 2015 at 11:32:14AM -0700, Alexander Duyck wrote: >> On 10/28/2015 09:37 AM, Bjorn Helgaas wrote: >>> Hi Alex, >>> >>> On Tue, Oct 27, 2015 at 01:52:21PM -0700, Alexander Duyck wrote: >>>> This patch forces us to reallocate VF BARs if the totalVFs value has >>>> increased after enabling ARI. This normally shouldn't occur, however I >>>> have seen some non-spec devices that shift between 7 and some value greater >>>> than 7 based on the ARI value and we want to avoid triggering any issues >>>> with such devices. >>> >>> Can you include specifics about the devices? The value "7" is pretty >>> specific, so if we're going to include that level of detail, we should >>> have the actual device info to go with it. >> >> I referenced 7 as that is the largest number of VFs a single >> function can support assuming a single function without ARI and >> without the ability to handle Type 1 configuration requests. The >> Intel fm10k driver has logic in it that does a check for ARI and if >> it is supported it reports via sysfs a totalVFs of 64, otherwise it >> limits the totalVFs reported to 7. However, I don't believe it >> exposes the limitation via the configuration space. > > Ah, OK, that makes sense. > >>> I guess the problem is: >>> >>> - Device supports 7 TotalVFs with ARI disabled, >7 with ARI enabled >>> - Firmware leaves ARI disabled in SRIOV_CTRL >>> - Firmware computes size based on 7 VFs >>> - Firmware allocates space and programs BARs for 7 VFs >>> - Linux enables ARI, reads >7 TotalVFs >>> - Linux computes size based on >7 VFs >>> - Increased size may overlap other resources >>> >>> Right? >> >> Right. More than likely what will happen is that you will see >> overlap of the device on itself if it has multiple base address >> registers assigned to the VFs. >> >>>> Fixes: 3aa71da412fe ("PCI: Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs") >>>> Signed-off-by: Alexander Duyck <aduyck@...antis.com> >>>> --- >>>> drivers/pci/iov.c | 11 ++++++++++- >>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c >>>> index 099050d78a39..238950412de0 100644 >>>> --- a/drivers/pci/iov.c >>>> +++ b/drivers/pci/iov.c >>>> @@ -393,7 +393,7 @@ static int sriov_init(struct pci_dev *dev, int pos) >>>> int rc; >>>> int nres; >>>> u32 pgsz; >>>> - u16 ctrl, total; >>>> + u16 ctrl, total, orig_total; >>>> struct pci_sriov *iov; >>>> struct resource *res; >>>> struct pci_dev *pdev; >>>> @@ -402,6 +402,7 @@ static int sriov_init(struct pci_dev *dev, int pos) >>>> pci_pcie_type(dev) != PCI_EXP_TYPE_ENDPOINT) >>>> return -ENODEV; >>>> >>>> + pci_read_config_word(dev, pos + PCI_SRIOV_TOTAL_VF, &orig_total); >>>> pci_read_config_word(dev, pos + PCI_SRIOV_CTRL, &ctrl); >>>> if (ctrl & PCI_SRIOV_CTRL_VFE) { >>>> pci_write_config_word(dev, pos + PCI_SRIOV_CTRL, 0); >>>> @@ -450,6 +451,14 @@ found: >>>> } >>>> iov->barsz[i] = resource_size(res); >>>> res->end = res->start + resource_size(res) * total - 1; >>>> + >>>> + /* force reallocation of BARs if total VFs increased */ >>>> + if (orig_total < total) { >>>> + res->flags |= IORESOURCE_UNSET; >>>> + res->end -= res->start; >>>> + res->start = 0; >>>> + } >>> >>> Two thoughts here: >>> >>> 1) Even if the required space increased, it's possible that firmware >>> placed the BAR somewhere where the extra space is available. In that >>> case, this forces reallocation unnecessarily. >> >> I'd say it is possible, but not likely. From past experience I have >> seen BIOSes do some very dumb things when it comes to SR-IOV, >> assuming they even support it. >> >> In addition many of the VF devices out there support more than one >> base address register per function. The Intel NICs for example have >> one for device registers and one for MSI-X registers. And most >> BIOSes usually pack one right after the other from what I have seen. >> So while there may be more space there what usually happens is that >> the MSI-X region will have to be relocated in order to make room for >> expanding the other base address register. >> >> My last bit on all this is that VFs are meant to be assigned into >> guests. I would argue that for the sake of security we are much >> better off invalidating the VF base address registers and forcing a >> reallocation if there is even a risk of the VF base address register >> space overlapping with some other piece of host memory. We don't >> want to risk possibly exposing any bits of the host that we didn't >> intend on. > > Agreed, not likely for several reasons. > >>> 2) This *feels* like something the PCI core should be doing anyway, >>> even without any help here. Shouldn't we fail in pci_claim_resource() >>> and set IORESOURCE_UNSET there? > > This is really the core of my question -- what problem does this patch > solve? I'm trying to figure out if delaying the read of TotalVFs > until after we set ARI Capable Hierarchy is sufficient, and if it's > not sufficient, *why* not? I suppose you have a point. As long as the PCI core is taking care of any overlaps in pci_claim_resource that is probably good enough. - Alex -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists