[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20201103203302.210176223@linuxfoundation.org>
Date: Tue, 3 Nov 2020 21:36:09 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Xiang Chen <chenxiang66@...ilicon.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Subject: [PATCH 5.4 121/214] PM: runtime: Remove link state checks in rpm_get/put_supplier()
From: Xiang Chen <chenxiang66@...ilicon.com>
commit d12544fb2aa9944b180c35914031a8384ab082c1 upstream.
To support runtime PM for hisi SAS driver (the driver is in directory
drivers/scsi/hisi_sas), we add device link between scsi_device->sdev_gendev
(consumer device) and hisi_hba->dev(supplier device) with flags
DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE.
After runtime suspended consumers and supplier, unload the dirver which
causes a hung.
We found that it called function device_release_driver_internal() to
release the supplier device (hisi_hba->dev), as the device link was
busy, it set the device link state to DL_STATE_SUPPLIER_UNBIND, and
then it called device_release_driver_internal() to release the consumer
device (scsi_device->sdev_gendev).
Then it would try to call pm_runtime_get_sync() to resume the consumer
device, but because consumer-supplier relation existed, it would try
to resume the supplier first, but as the link state was already
DL_STATE_SUPPLIER_UNBIND, so it skipped resuming the supplier and only
resumed the consumer which hanged (it sends IOs to resume scsi_device
while the SAS controller is suspended).
Simple flow is as follows:
device_release_driver_internal -> (supplier device)
if device_links_busy ->
device_links_unbind_consumers ->
...
WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND)
device_release_driver_internal (consumer device)
pm_runtime_get_sync -> (consumer device)
...
__rpm_callback ->
rpm_get_suppliers ->
if link->state == DL_STATE_SUPPLIER_UNBIND -> skip the action of resuming the supplier
...
pm_runtime_clean_up_links
...
Correct suspend/resume ordering between a supplier device and its consumer
devices (resume the supplier device before resuming consumer devices, and
suspend consumer devices before suspending the supplier device) should be
guaranteed by runtime PM, but the state checks in rpm_get_supplier() and
rpm_put_supplier() break this rule, so remove them.
Signed-off-by: Xiang Chen <chenxiang66@...ilicon.com>
[ rjw: Subject and changelog edits ]
Cc: All applicable <stable@...r.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
drivers/base/power/runtime.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -291,8 +291,7 @@ static int rpm_get_suppliers(struct devi
device_links_read_lock_held()) {
int retval;
- if (!(link->flags & DL_FLAG_PM_RUNTIME) ||
- READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND)
+ if (!(link->flags & DL_FLAG_PM_RUNTIME))
continue;
retval = pm_runtime_get_sync(link->supplier);
@@ -312,8 +311,6 @@ static void rpm_put_suppliers(struct dev
list_for_each_entry_rcu(link, &dev->links.suppliers, c_node,
device_links_read_lock_held()) {
- if (READ_ONCE(link->status) == DL_STATE_SUPPLIER_UNBIND)
- continue;
while (refcount_dec_not_one(&link->rpm_active))
pm_runtime_put(link->supplier);
Powered by blists - more mailing lists