Message-ID: <49C74FCC.7070308@jp.fujitsu.com>
Date:	Mon, 23 Mar 2009 18:01:00 +0900
From:	Kenji Kaneshige <kaneshige.kenji@...fujitsu.com>
To:	Alex Chiang <achiang@...com>
CC:	jbarnes@...tuousgeek.org, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org, Trent Piepho <xyzzy@...akeasy.org>
Subject: Re: [PATCH v5 09/13] PCI: Introduce /sys/bus/pci/devices/.../remove

Alex Chiang wrote:
> This patch adds an attribute named "remove" to a PCI device's sysfs
> directory.  Writing a non-zero value to this attribute will remove the PCI
> device and any children of it.
> 
> Trent Piepho wrote the original implementation and documentation.
> 
> Thanks to Vegard Nossum for testing under kmemcheck and finding locking
> issues with the sysfs interface.
> 
> Cc: Trent Piepho <xyzzy@...akeasy.org>
> Signed-off-by: Alex Chiang <achiang@...com>
> ---
> 
>  Documentation/ABI/testing/sysfs-bus-pci |    8 +++++++
>  Documentation/filesystems/sysfs-pci.txt |   10 +++++++++
>  drivers/pci/pci-sysfs.c                 |   36 +++++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index 1697a16..1350fa6 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -66,6 +66,14 @@ Description:
>  		re-discover previously removed devices.
>  		Depends on CONFIG_HOTPLUG.
>  
> +What:		/sys/bus/pci/devices/.../remove
> +Date:		January 2009
> +Contact:	Linux PCI developers <linux-pci@...r.kernel.org>
> +Description:
> +		Writing a non-zero value to this attribute will
> +		hot-remove the PCI device and any of its children.
> +		Depends on CONFIG_HOTPLUG.
> +
>  What:		/sys/bus/pci/devices/.../vpd
>  Date:		February 2008
>  Contact:	Ben Hutchings <bhutchings@...arflare.com>
> diff --git a/Documentation/filesystems/sysfs-pci.txt b/Documentation/filesystems/sysfs-pci.txt
> index 9f8740c..26e4b8b 100644
> --- a/Documentation/filesystems/sysfs-pci.txt
> +++ b/Documentation/filesystems/sysfs-pci.txt
> @@ -12,6 +12,7 @@ that support it.  For example, a given bus might look like this:
>       |   |-- enable
>       |   |-- irq
>       |   |-- local_cpus
> +     |   |-- remove
>       |   |-- resource
>       |   |-- resource0
>       |   |-- resource1
> @@ -36,6 +37,7 @@ files, each with their own function.
>         enable	           Whether the device is enabled (ascii, rw)
>         irq		   IRQ number (ascii, ro)
>         local_cpus	   nearby CPU mask (cpumask, ro)
> +       remove		   remove device from kernel's list (ascii, wo)
>         resource		   PCI resource host addresses (ascii, ro)
>         resource0..N	   PCI resource N, if present (binary, mmap)
>         resource0_wc..N_wc  PCI WC map resource N, if prefetchable (binary, mmap)
> @@ -46,6 +48,7 @@ files, each with their own function.
>  
>    ro - read only file
>    rw - file is readable and writable
> +  wo - write only file
>    mmap - file is mmapable
>    ascii - file contains ascii text
>    binary - file contains binary data
> @@ -73,6 +76,13 @@ that the device must be enabled for a rom read to return data succesfully.
>  In the event a driver is not bound to the device, it can be enabled using the
>  'enable' file, documented above.
>  
> +The 'remove' file is used to remove the PCI device, by writing a non-zero
> +integer to the file.  This does not involve any kind of hot-plug functionality,
> +e.g. powering off the device.  The device is removed from the kernel's list of
> +PCI devices, the sysfs directory for it is removed, and the device will be
> +removed from any drivers attached to it. Removal of PCI root buses is
> +disallowed.
> +
>  Accessing legacy resources through sysfs
>  ----------------------------------------
>  
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index be7468a..e16990e 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -243,6 +243,39 @@ struct bus_attribute pci_bus_attrs[] = {
>  	__ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, bus_rescan_store),
>  	__ATTR_NULL
>  };
> +
> +static void remove_callback(struct device *dev)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	mutex_lock(&pci_remove_rescan_mutex);
> +	pci_remove_bus_device(pdev);
> +	mutex_unlock(&pci_remove_rescan_mutex);
> +}
> +
> +static ssize_t
> +remove_store(struct device *dev, struct device_attribute *dummy,
> +	     const char *buf, size_t count)
> +{
> +	int ret = 0;
> +	unsigned long val;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	if (strict_strtoul(buf, 0, &val) < 0)
> +		return -EINVAL;
> +
> +	if (pci_is_root_bus(pdev->bus))
> +		return -EBUSY;
> +
> +	/* An attribute cannot be unregistered by one of its own methods,
> +	 * so we have to use this roundabout approach.
> +	 */
> +	if (val)
> +		ret = device_schedule_callback(dev, remove_callback);
> +	if (ret)
> +		count = ret;
> +	return count;
> +}
>  #endif
>  

I still see the following kernel error messages when testing your
latest set of patches (on Jesse's linux-next). The test case is removing
an e1000e device or its parent bridge with "echo 1 > /sys/bus/pci/devices/
.../remove".

[  537.379995] =============================================
[  537.380124] [ INFO: possible recursive locking detected ]
[  537.380128] 2.6.29-rc8-kk #1
[  537.380128] ---------------------------------------------
[  537.380128] events/4/56 is trying to acquire lock:
[  537.380128]  (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
[  537.380128]
[  537.380128] but task is already holding lock:
[  537.380128]  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
[  537.380128]
[  537.380128] other info that might help us debug this:
[  537.380128] 3 locks held by events/4/56:
[  537.380128]  #0:  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
[  537.380128]  #1:  (&ss->work){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
[  537.380128]  #2:  (pci_remove_rescan_mutex){--..}, at: [<ffffffff803c10d1>] remove_callback+0x21/0x40
[  537.380128]
[  537.380128] stack backtrace:
[  537.380128] Pid: 56, comm: events/4 Not tainted 2.6.29-rc8-kk #1
[  537.380128] Call Trace:
[  537.380128]  [<ffffffff8026dfcd>] validate_chain+0xb7d/0x1260
[  537.380128]  [<ffffffff8026eade>] __lock_acquire+0x42e/0xa40
[  537.380128]  [<ffffffff8026f148>] lock_acquire+0x58/0x80
[  537.380128]  [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
[  537.380128]  [<ffffffff8025800d>] flush_workqueue+0x4d/0xa0
[  537.380128]  [<ffffffff80257fc0>] ? flush_workqueue+0x0/0xa0
[  537.383380]  [<ffffffff80258070>] flush_scheduled_work+0x10/0x20
[  537.383380]  [<ffffffffa0144065>] e1000_remove+0x55/0xfe [e1000e]
[  537.383380]  [<ffffffff8033ee30>] ? sysfs_schedule_callback_work+0x0/0x50
[  537.383380]  [<ffffffff803bfeb2>] pci_device_remove+0x32/0x70
[  537.383380]  [<ffffffff80441da9>] __device_release_driver+0x59/0x90
[  537.383380]  [<ffffffff80441edb>] device_release_driver+0x2b/0x40
[  537.383380]  [<ffffffff804419d6>] bus_remove_device+0xa6/0x120
[  537.384382]  [<ffffffff8043e46b>] device_del+0x12b/0x190
[  537.384382]  [<ffffffff8043e4f6>] device_unregister+0x26/0x70
[  537.384382]  [<ffffffff803ba969>] pci_stop_dev+0x49/0x60
[  537.384382]  [<ffffffff803baab0>] pci_remove_bus_device+0x40/0xc0
[  537.384382]  [<ffffffff803c10d9>] remove_callback+0x29/0x40
[  537.384382]  [<ffffffff8033ee4f>] sysfs_schedule_callback_work+0x1f/0x50
[  537.384382]  [<ffffffff8025769a>] run_workqueue+0x15a/0x230
[  537.384382]  [<ffffffff80257648>] ? run_workqueue+0x108/0x230
[  537.384382]  [<ffffffff8025846f>] worker_thread+0x9f/0x100
[  537.384382]  [<ffffffff8025bce0>] ? autoremove_wake_function+0x0/0x40
[  537.384382]  [<ffffffff802583d0>] ? worker_thread+0x0/0x100
[  537.384382]  [<ffffffff8025b89d>] kthread+0x4d/0x80
[  537.384382]  [<ffffffff8020d4ba>] child_rip+0xa/0x20
[  537.386380]  [<ffffffff8020cebc>] ? restore_args+0x0/0x30
[  537.386380]  [<ffffffff8025b850>] ? kthread+0x0/0x80
[  537.386380]  [<ffffffff8020d4b0>] ? child_rip+0x0/0x20

I think this error message is caused by calling flush_workqueue() from
a work item run by keventd. When a device is removed through
"/sys/bus/pci/devices/.../remove", pci_remove_bus_device() is executed
as keventd work via device_schedule_callback(), and it invokes e1000e's
remove callback, which in turn calls flush_workqueue(). Indeed, the
kernel error messages no longer appear when I change the e1000e driver
not to call flush_workqueue(). As I understand it, calling
flush_workqueue() from within a work item must be avoided because it
can cause a deadlock. Please note that this is not a problem in the
e1000e driver; drivers are, of course, allowed to use flush_workqueue().
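
To illustrate the pattern I mean, here is a minimal user-space sketch.
It is only a toy model, not the kernel's workqueue code: all toy_*
names are made up for this example, and the "flush" simply reports the
self-wait instead of blocking. It shows a work item that ends up
flushing the very queue it is running on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WORK 8

typedef void (*toy_work_fn)(void);

/* Toy stand-in for a shared queue: one worker, items run in order. */
static toy_work_fn toy_items[TOY_MAX_WORK];
static size_t toy_head, toy_tail;
static bool toy_in_worker;    /* true while a work item is executing */
static int toy_self_flushes;  /* flushes issued from inside a work item */

static void toy_schedule_work(toy_work_fn fn)
{
	assert(toy_tail < TOY_MAX_WORK);
	toy_items[toy_tail++] = fn;
}

/* Run every queued item, as the single worker thread would. */
static void toy_run_worker(void)
{
	while (toy_head < toy_tail) {
		toy_work_fn fn = toy_items[toy_head++];

		toy_in_worker = true;
		fn();
		toy_in_worker = false;
	}
}

/*
 * "Wait until the queue is idle."  When called from inside a work item
 * the queue cannot become idle, because the caller itself is still
 * running.  A naive blocking flush would wait forever here; the toy
 * records the event and returns -1 instead.
 */
static int toy_flush(void)
{
	if (toy_in_worker) {
		toy_self_flushes++;
		return -1;
	}
	toy_run_worker();
	return 0;
}

/* Hypothetical driver remove hook that flushes the shared queue. */
static void toy_driver_remove(void)
{
	(void)toy_flush();
}

/* Stand-in for remove_callback(): runs as a work item, calls the hook. */
static void toy_remove_callback(void)
{
	toy_driver_remove();
}

/* Returns the number of self-flushes seen when removal runs as work. */
static int toy_demo(void)
{
	toy_schedule_work(toy_remove_callback);
	toy_run_worker();
	return toy_self_flushes;
}
```

Calling toy_flush() from ordinary context succeeds, while toy_demo()
reports one self-flush, which corresponds to the situation lockdep is
warning about in the trace above.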

BTW, I also have another concern about running pci_remove_bus_device()
as keventd work. pci_remove_bus_device() can take a long time,
especially when a bridge device near the root bus is specified. Such a
long delay in one keventd work item will adversely affect the other
work items on the workqueue.
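
As a rough illustration of this concern, the following user-space
sketch models a single worker that runs queued items strictly in
order. The sim_* names and the millisecond costs are hypothetical and
are not measurements; the point is only that an item's start time is
the sum of the costs of everything queued ahead of it.

```c
#include <assert.h>
#include <stddef.h>

/* One queued work item with an assumed (made-up) execution cost. */
struct sim_work {
	const char *name;
	unsigned int cost_ms;
};

/*
 * Time at which item 'idx' starts running on a single worker that
 * executes the n queued items one after another, starting at time 0.
 */
static unsigned int sim_start_time(const struct sim_work *q, size_t n,
				   size_t idx)
{
	unsigned int t = 0;
	size_t i;

	assert(idx < n);
	for (i = 0; i < idx; i++)
		t += q[i].cost_ms;
	return t;
}
```

For example, with a queue holding a removal assumed to cost 3000 ms
followed by an unrelated item, sim_start_time() puts the unrelated item
at 3000 ms, even though it needs almost no time itself.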

Thanks,
Kenji Kaneshige



>  struct device_attribute pci_dev_attrs[] = {
> @@ -263,6 +296,9 @@ struct device_attribute pci_dev_attrs[] = {
>  	__ATTR(broken_parity_status,(S_IRUGO|S_IWUSR),
>  		broken_parity_status_show,broken_parity_status_store),
>  	__ATTR(msi_bus, 0644, msi_bus_show, msi_bus_store),
> +#ifdef CONFIG_HOTPLUG
> +	__ATTR(remove, (S_IWUSR|S_IWGRP), NULL, remove_store),
> +#endif
>  	__ATTR_NULL,
>  };
>  
> 

