Message-ID: <bff30bc1-a4cd-3ddd-4335-df0d0cc75244@oracle.com>
Date:   Wed, 8 Nov 2017 09:00:33 -0600
From:   Govinda Tatti <govinda.tatti@...cle.com>
To:     Roger Pau Monné <roger.pau@...rix.com>
Cc:     jgross@...e.com, xen-devel@...ts.xenproject.org,
        boris.ostrovsky@...cle.com, linux-kernel@...r.kernel.org,
        konrad.wilk@...cle.com
Subject: Re: [Xen-devel] [PATCH] Xen/pciback: Implement PCI slot or bus reset
 with 'do_flr' SysFS attribute

Thanks, Roger, for your review comments. Please see my responses inline below.

On 11/7/2017 5:21 AM, Roger Pau Monné wrote:
> On Mon, Nov 06, 2017 at 12:48:42PM -0500, Govinda Tatti wrote:
>> The life-cycle of a PCI device in Xen pciback is complex and is constrained
>> by the generic PCI locking mechanism.
>>
>> - It starts with the device being bound to us, for which we do a function
>>    reset (done via SysFS so the PCI lock is held).
>> - If the device is unbound from us, we also do a function reset
>>    (done via SysFS so the PCI lock is held).
>> - If the device is un-assigned from a guest - we do a function reset
>>    (no PCI lock is held).
>>
>> All reset operations are done on the individual PCI function level
>> (so bus:device:function).
>>
>> Resetting an individual PCI function requires the device to support FLR
>> (PCIe or AF), a PM reset on D3hot->D0, a device-specific reset, or a
>> secondary bus reset for a singleton device on a bus. However, FLR is not
>> widely supported and is not reliable in some cases. So, we need to provide
>> users with an alternative mechanism to perform a slot or bus level reset.
>>
>> Currently, a slot or bus reset is not exposed in SysFS as there is no good
>> way of exposing a bus topology there. This is due to the complexity -
>> we MUST know that the different functions of a PCIe device are not in use
>> by other drivers, or if they are in use (say one of them is assigned to a
>> guest and the other is idle) - it is still OK to reset the slot (assuming
>> both of them are owned by Xen pciback).
>>
>> This patch does that by doing a slot or bus reset (if slot not supported)
>> if all of the functions of a PCIe device belong to Xen PCIback.
>>
>> Due to the complexity with the PCI lock we cannot do the reset when a
>> device is bound ('echo $BDF > bind') or when unbound ('echo $BDF > unbind')
>> as the pci_[slot|bus]_reset also takes the same lock resulting in a
>> dead-lock.
>>
>> Putting the reset function in a work-queue or thread won't work either -
>> as we have to do the reset function outside the 'unbind' context (it holds
>> the PCI lock). But once you 'unbind' a device the device is no longer under
>> the ownership of Xen pciback and the pci_set_drvdata has been reset, so
>> we cannot use a thread for this.
>>
>> Instead of doing all this complex dance, we depend on the tool-stack doing
>> the right thing. As such, we implement the 'do_flr' SysFS attribute which
>> 'xl' uses when a device is detached or attached from/to a guest. It
>> bypasses the need to worry about the PCI lock.
>>
>> To not inadvertently do a bus reset that would affect devices that are in
>> use by other drivers (other than Xen pciback) prior to the reset, we check
>> that all of the devices under the bridge are owned by Xen pciback. If they
>> are not, we refrain from executing the bus (or slot) reset.
>>
>> Signed-off-by: Govinda Tatti <Govinda.Tatti@...cle.COM>
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
>> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@...cle.com>
>> ---
>>   Documentation/ABI/testing/sysfs-driver-pciback |  12 +++
>>   drivers/xen/xen-pciback/pci_stub.c             | 125 +++++++++++++++++++++++++
>>   2 files changed, 137 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
>> index 6a733bf..ccf7dc0 100644
>> --- a/Documentation/ABI/testing/sysfs-driver-pciback
>> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
>> @@ -11,3 +11,15 @@ Description:
>>                   #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
>>                   will allow the guest to read and write to the configuration
>>                   register 0x0E.
>> +
>> +What:           /sys/bus/pci/drivers/pciback/do_flr
>> +Date:           Nov 2017
>> +KernelVersion:  4.15
>> +Contact:        xen-devel@...ts.xenproject.org
>> +Description:
>> +                An option to perform a slot or bus reset when a PCI device
>> +		is owned by Xen PCI backend. Writing a string of DDDD:BB:DD.F
>> +		will cause the pciback driver to perform a slot or bus reset
>> +		if the device supports it. It also checks to make sure that
>> +		all of the devices under the bridge are owned by Xen PCI
>> +		backend.
>> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> index 6331a95..2b2c269 100644
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -244,6 +244,96 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>>   	return found_dev;
>>   }
>>   
>> +struct pcistub_args {
>> +	struct pci_dev *dev;
> const?
This field will point to the first device that is not owned by pcistub.
>
>> +	int dcount;
> unsigned int.
OK.
>
>> +};
>> +
>> +static int pcistub_search_dev(struct pci_dev *dev, void *data)
> Seems like this function would better return a boolean rather than an
> int.
pcistub_search_dev() is invoked through pci_walk_bus(), which expects the
callback to return an int.
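
To illustrate why the callback returns an int: pci_walk_bus() takes a
callback of type int (*)(struct pci_dev *, void *) and stops walking as soon
as the callback returns non-zero. Below is a minimal userspace sketch of that
contract, not kernel code. The names fake_dev, search_args, search_dev and
walk_devs are hypothetical stand-ins, and the "walk" is just a flat array
loop rather than a real bus traversal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct fake_dev {
	const char *name;
	int owned;			/* non-zero if owned by the backend */
};

struct search_args {
	const struct fake_dev *dev;	/* first device NOT owned, or NULL */
	unsigned int dcount;		/* number of owned devices seen */
};

/* Callback contract: return 0 to continue the walk, non-zero to stop it. */
static int search_dev(const struct fake_dev *dev, void *data)
{
	struct search_args *arg = data;

	if (dev->owned) {
		arg->dcount++;
		return 0;
	}

	/* Record the offending device and abort the walk. */
	arg->dev = dev;
	return 1;
}

/* Simplified model of a bus walk: stop at the first non-zero return. */
static void walk_devs(const struct fake_dev *devs, size_t n,
		      int (*cb)(const struct fake_dev *, void *), void *data)
{
	size_t i;

	assert(cb != NULL);

	for (i = 0; i < n; i++)
		if (cb(&devs[i], data))
			break;
}
```

With three devices where only the second one is not owned, the walk stops at
the second device: arg.dev points at it and arg.dcount is 1, because the
third device is never visited.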
>
>> +{
>> +	struct pcistub_device *psdev;
>> +	struct pcistub_args *arg = data;
>> +	bool found_dev = false;
>> +	unsigned long flags;
>> +
>> +	spin_lock_irqsave(&pcistub_devices_lock, flags);
>> +
>> +	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
>> +		if (psdev->dev == dev) {
>> +			found_dev = true;
>> +			arg->dcount++;
>> +			break;
>> +		}
>> +	}
>> +
>> +	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>> +
>> +	/* Device not owned by pcistub, someone owns it. Abort the walk */
>> +	if (!found_dev)
>> +		arg->dev = dev;
>> +
>> +	return found_dev ? 0 : 1;
>> +}
>> +
>> +static int pcistub_reset_dev(struct pci_dev *dev)
>> +{
>> +	struct xen_pcibk_dev_data *dev_data;
>> +	bool slot = false, bus = false;
>> +
>> +	if (!dev)
>> +		return -EINVAL;
>> +
>> +	dev_dbg(&dev->dev, "[%s]\n", __func__);
>> +
>> +	if (!pci_probe_reset_slot(dev->slot)) {
>> +		slot = true;
>> +	} else if (!pci_probe_reset_bus(dev->bus)) {
>> +		/* We won't attempt to reset a root bridge. */
>> +		if (!pci_is_root_bus(dev->bus))
>> +			bus = true;
> Can't you join the two ifs, i.e.:
>
> } else if (!pci_probe_reset_bus(dev->bus) && !pci_is_root_bus(dev->bus)) {
Ok
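
A rough userspace sketch of how the selection reads once the two conditions
are joined. This is only a model: reset_caps, reset_kind and pick_reset are
hypothetical names, and the booleans stand in for the results of
pci_probe_reset_slot(), pci_probe_reset_bus() and pci_is_root_bus() (here
true means "supported", whereas the kernel probe helpers return 0 on
success).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum reset_kind { RESET_SLOT, RESET_BUS };

/* Hypothetical summary of what the probe helpers reported. */
struct reset_caps {
	bool slot_ok;	/* slot reset is supported */
	bool bus_ok;	/* bus reset is supported */
	bool root_bus;	/* device sits directly on a root bus */
};

/*
 * Prefer a slot reset; fall back to a bus reset unless the bus is a
 * root bus. Returns 0 and fills *kind, or -EOPNOTSUPP if neither applies.
 */
static int pick_reset(const struct reset_caps *caps, enum reset_kind *kind)
{
	assert(caps != NULL && kind != NULL);

	if (caps->slot_ok)
		*kind = RESET_SLOT;
	else if (caps->bus_ok && !caps->root_bus)
		*kind = RESET_BUS;
	else
		return -EOPNOTSUPP;

	return 0;
}
```

The root bus check only matters on the bus-reset path, so folding it into
the else-if keeps the behaviour the same while removing one nesting level.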
>
>> +	}
>> +
>> +	if (!bus && !slot)
>> +		return -EOPNOTSUPP;
>> +
>> +	if (!slot) {
>> +		struct pcistub_args arg = { .dev = NULL, .dcount = 0 };
>> +
>> +		/*
>> +		 * Make sure all devices on this bus are owned by the
>> +		 * PCI backend so that we can safely reset the whole bus.
>> +		 */
>> +		pci_walk_bus(dev->bus, pcistub_search_dev, &arg);
>> +
>> +		/* All devices under the bus should be part of pcistub! */
>> +		if (arg.dev) {
>> +			dev_err(&dev->dev, "%s device on the bus is not owned by pcistub\n",
>> +				pci_name(arg.dev));
>> +
>> +			return -EBUSY;
> Not sure EBUSY is the best return code here, EINVAL?
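I don't think EINVAL is the right return code for this case. The input is
valid; the reset is refused because another driver owns a device on the
bus, so EBUSY seems a better fit.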
>
>> +		}
>> +
>> +		dev_dbg(&dev->dev, "pcistub owns %d devices on the bus\n",
>> +			arg.dcount);
>> +	}
>> +
>> +	dev_data = pci_get_drvdata(dev);
>> +	if (!pci_load_saved_state(dev, dev_data->pci_saved_state))
>> +		pci_restore_state(dev);
>> +
>> +	/* This disables the device. */
>> +	xen_pcibk_reset_device(dev);
>> +
>> +	/* Clean up any emulated fields */
>> +	xen_pcibk_config_reset_dev(dev);
>> +
>> +	dev_dbg(&dev->dev, "resetting %s device using %s reset\n",
>> +		pci_name(dev), slot ? "slot" : "bus");
>> +
>> +	return slot ? pci_try_reset_slot(dev->slot) :
>> +			pci_try_reset_bus(dev->bus);
>> +}
>> +
>>   /*
>>    * Called when:
>>    *  - XenBus state has been reconfigure (pci unplug). See xen_pcibk_remove_device
>> @@ -1434,6 +1524,34 @@ static ssize_t permissive_show(struct device_driver *drv, char *buf)
>>   static DRIVER_ATTR(permissive, S_IRUSR | S_IWUSR, permissive_show,
>>   		   permissive_add);
>>   
>> +static ssize_t pcistub_do_flr(struct device_driver *drv, const char *buf,
>> +			      size_t count)
>> +{
>> +	struct pcistub_device *psdev;
>> +	int domain, bus, slot, func;
>> +	int err;
>> +
>> +	err = str_to_slot(buf, &domain, &bus, &slot, &func);
>> +	if (err)
>> +		goto out;
> 		return err;
>
>> +
>> +	psdev = pcistub_device_find(domain, bus, slot, func);
> 	if (!psdev)
> 		return -ENODEV;
>
>> +	if (psdev) {
>> +		err = pcistub_reset_dev(psdev->dev);
>> +		pcistub_device_put(psdev);
>> +	} else {
>> +		err = -ENODEV;
>> +	}
>> +
>> +out:
>> +	if (!err)
>> +		err = count;
>> +
>> +	return err;
> What's the purpose of returning count here?
Not sure. I will check with Konrad and address this comment.
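
For reference, as far as I understand the sysfs convention, a store()
callback returns the number of bytes it consumed on success (normally the
full count) and a negative errno on failure. The userspace sketch below
models that shape with the early returns suggested above. It is not the
driver code: do_reset_store, owned_bdf and fake_reset are hypothetical, and
the plain sscanf() parse is only an assumption standing in for
str_to_slot().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical: the one device this sketch treats as owned by the backend. */
static const unsigned int owned_bdf[4] = { 0x0000, 0x03, 0x00, 0x0 };

/* Stub standing in for the real reset; always succeeds in this sketch. */
static int fake_reset(void)
{
	return 0;
}

/* Model of a sysfs store(): count on success, negative errno on failure. */
static ssize_t do_reset_store(const char *buf, size_t count)
{
	unsigned int bdf[4];
	int err, i;

	assert(buf != NULL);

	/* Parse DDDD:BB:DD.F; anything else is an invalid argument. */
	if (sscanf(buf, "%x:%x:%x.%x",
		   &bdf[0], &bdf[1], &bdf[2], &bdf[3]) != 4)
		return -EINVAL;

	/* Device lookup: unknown devices are reported as -ENODEV. */
	for (i = 0; i < 4; i++)
		if (bdf[i] != owned_bdf[i])
			return -ENODEV;

	err = fake_reset();
	if (err)
		return err;

	/* Tell the caller the whole buffer was consumed. */
	return count;
}
```

Returning count on success matters because the value becomes the result of
the write() call in userspace; a return of 0 would tell the writer that no
bytes were accepted.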

I will post a revised patch later this week. Thanks.

Cheers
GOVINDA
