lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 24 Jun 2016 23:37:02 +0800
From:	Yongji Xie <xyjxie@...ux.vnet.ibm.com>
To:	Alex Williamson <alex.williamson@...hat.com>
Cc:	kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
	bhelgaas@...gle.com, aik@...abs.ru, benh@...nel.crashing.org,
	paulus@...ba.org, mpe@...erman.id.au, warrier@...ux.vnet.ibm.com,
	zhong@...ux.vnet.ibm.com, nikunj@...ux.vnet.ibm.com,
	gwshan@...ux.vnet.ibm.com, kevin.tian@...el.com
Subject: Re: [PATCH v4] vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio
 page is exclusive

Hi, Alex

On 2016/6/24 11:37, Alex Williamson wrote:

> On Fri, 24 Jun 2016 10:52:58 +0800
> Yongji Xie <xyjxie@...ux.vnet.ibm.com> wrote:
>> On 2016/6/24 0:12, Alex Williamson wrote:
>>> On Mon, 30 May 2016 21:06:37 +0800
>>> Yongji Xie <xyjxie@...ux.vnet.ibm.com> wrote:
>>>> +static void vfio_pci_probe_mmaps(struct vfio_pci_device *vdev)
>>>> +{
>>>> +	struct resource *res;
>>>> +	int bar;
>>>> +	struct vfio_pci_dummy_resource *dummy_res;
>>>> +
>>>> +	INIT_LIST_HEAD(&vdev->dummy_resources_list);
>>>> +
>>>> +	for (bar = PCI_STD_RESOURCES; bar <= PCI_STD_RESOURCE_END; bar++) {
>>>> +		res = vdev->pdev->resource + bar;
>>>> +
>>>> +		if (!IS_ENABLED(CONFIG_VFIO_PCI_MMAP))
>>>> +			goto no_mmap;
>>>> +
>>>> +		if (!(res->flags & IORESOURCE_MEM))
>>>> +			goto no_mmap;
>>>> +
>>>> +		/*
>>>> +		 * The PCI core shouldn't set up a resource with a
>>>> +		 * type but zero size. But there may be bugs that
>>>> +		 * cause us to do that.
>>>> +		 */
>>>> +		if (!resource_size(res))
>>>> +			goto no_mmap;
>>>> +
>>>> +		if (resource_size(res) >= PAGE_SIZE) {
>>>> +			vdev->bar_mmap_supported[bar] = true;
>>>> +			continue;
>>>> +		}
>>>> +
>>>> +		if (!(res->start & ~PAGE_MASK)) {
>>>> +			/*
>>>> +			 * Add a dummy resource to reserve the remainder
>>>> +			 * of the exclusive page in case that hot-add
>>>> +			 * device's bar is assigned into it.
>>>> +			 */
>>>> +			dummy_res = kzalloc(sizeof(*dummy_res), GFP_KERNEL);
>>>> +			if (dummy_res == NULL)
>>>> +				goto no_mmap;
>>>> +
>>>> +			dummy_res->resource.start = res->end + 1;
>>>> +			dummy_res->resource.end = res->start + PAGE_SIZE - 1;
>>>> +			dummy_res->resource.flags = res->flags;
>>>> +			if (request_resource(res->parent,
>>>> +						&dummy_res->resource)) {
>>>> +				kfree(dummy_res);
>>>> +				goto no_mmap;
>>>> +			}
>>> Isn't it true that request_resource() only tells us that at a given
>>> point in time, no other drivers have reserved that resource?  It seems
>>> like it does not guarantee that the resource isn't routed to another
>>> device or that another driver won't at some point attempt to request
>>> that same resource.  So for example if a user constructs their initrd
>>> to bind vfio-pci to devices before other modules load, this
>>> request_resource() may succeed, at the expense of drivers loaded later
>>> now failing.  The behavior will depend on driver load order and we're
>>> not actually insuring that the overflow resource is unused, just that
>>> we got it first.  Can we do better?  Am I missing something that
>>> prevents this?  Thanks,
>>>
>>> Alex
>> Couldn't PCI resources allocator prevent this, which will find a
>> empty slot in the resource tree firstly, then try to request that
>> resource in allocate_resource() when a PCI device is probed.
>> And I'd like to know why a PCI device driver would attempt to
>> call request_resource()? Should this be done in PCI enumeration?
> Hi Yongji,
>
> Looks like most pci drivers call pci_request_regions().  From there the
> call path is:
>
> pci_request_selected_regions
>    __pci_request_selected_regions
>      __pci_request_region
>        __request_mem_region
>          __request_region
>            __request_resource
>
> We see this driver ordering issue sometimes with users attempting to
> blacklist native pci drivers, trying to leave a device free for use by
> vfio-pci.  If the device is a graphics card, the generic vesa or uefi
> driver can request device resources causing a failure when vfio-pci
> tries to request those same resources.  I expect that unless it's a
> boot device, like vga in my example, the resources are not enabled
> until the driver opens the device, therefore the request_resource() call
> doesn't occur until that point.
>
> For another trivial example, look at /proc/iomem as you load and unload
> a driver, on my laptop with e1000e unloaded I see:
>
>    e1200000-e121ffff : 0000:00:19.0
>    e123e000-e123efff : 0000:00:19.0
>
> When e1000e is loaded, each of these becomes claimed by the e1000e
> driver:
>
>    e1200000-e121ffff : 0000:00:19.0
>      e1200000-e121ffff : e1000e
>    e123e000-e123efff : 0000:00:19.0
>      e123e000-e123efff : e1000e
>
> Clearly pci core knows the resource is associated with the device, but
> I don't think we're tapping into that with request_resource(), we're
> just potentially stealing resources that another driver might have
> claimed otherwise as I described above.  That's my suspicion at
> least, feel free to show otherwise if it's incorrect.  Thanks,
>
> Alex
>

Thanks for your explanation. But I still have one question.
Shouldn't PCI core have claimed all PCI device's resources
after probing those devices. If so, request_resource() will fail
when vfio-pci try to steal resources that another driver might
request later. Anything I missed here?  Some device resources
would not be claimed in PCI core?

Thanks,
Yongji

Powered by blists - more mailing lists