lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Fri, 24 Jun 2016 23:37:02 +0800 From: Yongji Xie <xyjxie@...ux.vnet.ibm.com> To: Alex Williamson <alex.williamson@...hat.com> Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, bhelgaas@...gle.com, aik@...abs.ru, benh@...nel.crashing.org, paulus@...ba.org, mpe@...erman.id.au, warrier@...ux.vnet.ibm.com, zhong@...ux.vnet.ibm.com, nikunj@...ux.vnet.ibm.com, gwshan@...ux.vnet.ibm.com, kevin.tian@...el.com Subject: Re: [PATCH v4] vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive Hi, Alex On 2016/6/24 11:37, Alex Williamson wrote: > On Fri, 24 Jun 2016 10:52:58 +0800 > Yongji Xie <xyjxie@...ux.vnet.ibm.com> wrote: >> On 2016/6/24 0:12, Alex Williamson wrote: >>> On Mon, 30 May 2016 21:06:37 +0800 >>> Yongji Xie <xyjxie@...ux.vnet.ibm.com> wrote: >>>> +static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev) >>>> +{ >>>> + struct resource *res; >>>> + int bar; >>>> + struct vfio_pci_dummy_resource *dummy_res; >>>> + >>>> + INIT_LIST_HEAD(&vdev->dummy_resources_list); >>>> + >>>> + for (bar = PCI_STD_RESOURCES; bar <= PCI_STD_RESOURCE_END; bar++) { >>>> + res = vdev->pdev->resource + bar; >>>> + >>>> + if (!IS_ENABLED(CONFIG_VFIO_PCI_MMAP)) >>>> + goto no_mmap; >>>> + >>>> + if (!(res->flags & IORESOURCE_MEM)) >>>> + goto no_mmap; >>>> + >>>> + /* >>>> + * The PCI core shouldn't set up a resource with a >>>> + * type but zero size. But there may be bugs that >>>> + * cause us to do that. >>>> + */ >>>> + if (!resource_size(res)) >>>> + goto no_mmap; >>>> + >>>> + if (resource_size(res) >= PAGE_SIZE) { >>>> + vdev->bar_mmap_supported[bar] = true; >>>> + continue; >>>> + } >>>> + >>>> + if (!(res->start & ~PAGE_MASK)) { >>>> + /* >>>> + * Add a dummy resource to reserve the remainder >>>> + * of the exclusive page in case that hot-add >>>> + * device's bar is assigned into it. >>>> + */ >>>> + dummy_res = kzalloc(sizeof(*dummy_res), GFP_KERNEL); >>>> + if (dummy_res == NULL) >>>> + goto no_mmap; >>>> + >>>> + dummy_res->resource.start = res->end + 1; >>>> + dummy_res->resource.end = res->start + PAGE_SIZE - 1; >>>> + dummy_res->resource.flags = res->flags; >>>> + if (request_resource(res->parent, >>>> + &dummy_res->resource)) { >>>> + kfree(dummy_res); >>>> + goto no_mmap; >>>> + } >>> Isn't it true that request_resource() only tells us that at a given >>> point in time, no other drivers have reserved that resource? It seems >>> like it does not guarantee that the resource isn't routed to another >>> device or that another driver won't at some point attempt to request >>> that same resource. So for example if a user constructs their initrd >>> to bind vfio-pci to devices before other modules load, this >>> request_resource() may succeed, at the expense of drivers loaded later >>> now failing. The behavior will depend on driver load order and we're >>> not actually insuring that the overflow resource is unused, just that >>> we got it first. Can we do better? Am I missing something that >>> prevents this? Thanks, >>> >>> Alex >> Couldn't PCI resources allocator prevent this, which will find a >> empty slot in the resource tree firstly, then try to request that >> resource in allocate_resource() when a PCI device is probed. >> And I'd like to know why a PCI device driver would attempt to >> call request_resource()? Should this be done in PCI enumeration? > Hi Yongji, > > Looks like most pci drivers call pci_request_regions(). From there the > call path is: > > pci_request_selected_regions > __pci_request_selected_regions > __pci_request_region > __request_mem_region > __request_region > __request_resource > > We see this driver ordering issue sometimes with users attempting to > blacklist native pci drivers, trying to leave a device free for use by > vfio-pci. If the device is a graphics card, the generic vesa or uefi > driver can request device resources causing a failure when vfio-pci > tries to request those same resources. I expect that unless it's a > boot device, like vga in my example, the resources are not enabled > until the driver opens the device, therefore the request_resource() call > doesn't occur until that point. > > For another trivial example, look at /proc/iomem as you load and unload > a driver, on my laptop with e1000e unloaded I see: > > e1200000-e121ffff : 0000:00:19.0 > e123e000-e123efff : 0000:00:19.0 > > When e1000e is loaded, each of these becomes claimed by the e1000e > driver: > > e1200000-e121ffff : 0000:00:19.0 > e1200000-e121ffff : e1000e > e123e000-e123efff : 0000:00:19.0 > e123e000-e123efff : e1000e > > Clearly pci core knows the resource is associated with the device, but > I don't think we're tapping into that with request_resource(), we're > just potentially stealing resources that another driver might have > claimed otherwise as I described above. That's my suspicion at > least, feel free to show otherwise if it's incorrect. Thanks, > > Alex > Thanks for your explanation. But I still have one question. Shouldn't PCI core have claimed all PCI device's resources after probing those devices. If so, request_resource() will fail when vfio-pci try to steal resources that another driver might request later. Anything I missed here? Some device resources would not be claimed in PCI core? Thanks, Yongji
Powered by blists - more mailing lists