lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue,  3 Jul 2018 14:50:40 +0800
From:   Pingfan Liu <kernelfans@...il.com>
To:     linux-kernel@...r.kernel.org
Cc:     Pingfan Liu <kernelfans@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Grygorii Strashko <grygorii.strashko@...com>,
        Christoph Hellwig <hch@...radead.org>,
        Bjorn Helgaas <helgaas@...nel.org>,
        Dave Young <dyoung@...hat.com>, linux-pci@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org
Subject: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

commit 52cdbdd49853 ("driver core: correct device's shutdown order")
places an assumption of supplier<-consumer order on the process of probe.
But it turns out to break down the parent <- child order in some scene.
E.g in pci, a bridge is enabled by pci core, and behind it, the devices
have been probed. Then comes the bridge's module, which enables extra
feature(such as hotplug) on this bridge. This will break the
parent<-children order and cause failure when "kexec -e" in some scenario.

The detailed description of the scenario:
An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
to some issue. For this case, the bridge is moved after its children in
devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
write back buffer in flight due to the former shutdown of the bridge which
clears the BusMaster bit.

It is a little hard to impose both "parent<-child" and "supplier<-consumer"
order on devices_kset. Take the following scene:
step0: before a consumer's probing, (note child_a is supplier of consumer_a)
  [ consumer-X, child_a, ...., child_z] [... consumer_a, ..., consumer_z, ...] supplier-X
                                         ^^^^^^^^^^ affected range ^^^^^^^^^^
step1: when probing, moving consumer-X after supplier-X
  [ child_a, ...., child_z] [.... consumer_a, ..., consumer_z, ...] supplier-X, consumer-X
step2: the children of consumer-X should be re-ordered to maintain the seq
  [... consumer_a, ..., consumer_z, ....] supplier-X  [consumer-X, child_a, ...., child_z]
step3: the consumer_a should be re-ordered to maintain the seq
  [... consumer_z, ...] supplier-X [ consumer-X, child_a, consumer_a ..., child_z]

It requires two nested recursion to drain out all out-of-order item in
"affected range". To avoid such complicated code, this patch suggests
to utilize the info in device tree, instead of using the order of
devices_kset during shutdown. It iterates the device tree, and firstly
shutdown a device's children and consumers. After this patch, the buggy
commit is hollow and left to clean.

Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
Cc: Grygorii Strashko <grygorii.strashko@...com>
Cc: Christoph Hellwig <hch@...radead.org>
Cc: Bjorn Helgaas <helgaas@...nel.org>
Cc: Dave Young <dyoung@...hat.com>
Cc: linux-pci@...r.kernel.org
Cc: linuxppc-dev@...ts.ozlabs.org
Signed-off-by: Pingfan Liu <kernelfans@...il.com>
---
 drivers/base/core.c    | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 include/linux/device.h |  1 +
 2 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index a48868f..684b994 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1446,6 +1446,7 @@ void device_initialize(struct device *dev)
 	INIT_LIST_HEAD(&dev->links.consumers);
 	INIT_LIST_HEAD(&dev->links.suppliers);
 	dev->links.status = DL_DEV_NO_DRIVER;
+	dev->shutdown = false;
 }
 EXPORT_SYMBOL_GPL(device_initialize);
 
@@ -2811,7 +2812,6 @@ static void __device_shutdown(struct device *dev)
 	 * lock is to be held
 	 */
 	parent = get_device(dev->parent);
-	get_device(dev);
 	/*
 	 * Make sure the device is off the kset list, in the
 	 * event that dev->*->shutdown() doesn't remove it.
@@ -2842,23 +2842,60 @@ static void __device_shutdown(struct device *dev)
 			dev_info(dev, "shutdown\n");
 		dev->driver->shutdown(dev);
 	}
-
+	dev->shutdown = true;
 	device_unlock(dev);
 	if (parent)
 		device_unlock(parent);
 
-	put_device(dev);
 	put_device(parent);
 	spin_lock(&devices_kset->list_lock);
 }
 
+/* shutdown dev's children and consumer firstly, then itself */
+static int device_for_each_child_shutdown(struct device *dev)
+{
+	struct klist_iter i;
+	struct device *child;
+	struct device_link *link;
+
+	/* already shutdown, then skip this sub tree */
+	if (dev->shutdown)
+		return 0;
+
+	if (!dev->p)
+		goto check_consumers;
+
+	/* there is breakage of lock in __device_shutdown(), and the redundant
+	 * ref++ on srcu protected consumer is harmless since shutdown is not
+	 * hot path.
+	 */
+	get_device(dev);
+
+	klist_iter_init(&dev->p->klist_children, &i);
+	while ((child = next_device(&i)))
+		device_for_each_child_shutdown(child);
+	klist_iter_exit(&i);
+
+check_consumers:
+	list_for_each_entry_rcu(link, &dev->links.consumers, s_node) {
+		if (!link->consumer->shutdown)
+			device_for_each_child_shutdown(link->consumer);
+	}
+
+	__device_shutdown(dev);
+	put_device(dev);
+	return 0;
+}
+
 /**
  * device_shutdown - call ->shutdown() on each device to shutdown.
  */
 void device_shutdown(void)
 {
 	struct device *dev;
+	int idx;
 
+	idx = device_links_read_lock();
 	spin_lock(&devices_kset->list_lock);
 	/*
 	 * Walk the devices list backward, shutting down each in turn.
@@ -2866,11 +2903,12 @@ void device_shutdown(void)
 	 * devices offline, even as the system is shutting down.
 	 */
 	while (!list_empty(&devices_kset->list)) {
-		dev = list_entry(devices_kset->list.prev, struct device,
+		dev = list_entry(devices_kset->list.next, struct device,
 				kobj.entry);
-		__device_shutdown(dev);
+		device_for_each_child_shutdown(dev);
 	}
 	spin_unlock(&devices_kset->list_lock);
+	device_links_read_unlock(idx);
 }
 
 /*
diff --git a/include/linux/device.h b/include/linux/device.h
index 055a69d..8a0f784 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -1003,6 +1003,7 @@ struct device {
 	bool			offline:1;
 	bool			of_node_reused:1;
 	bool			dma_32bit_limit:1;
+	bool			shutdown:1; /* one direction: false->true */
 };
 
 static inline struct device *kobj_to_dev(struct kobject *kobj)
-- 
2.7.4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ