lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <1458284463-12743-1-git-send-email-rajesh.bhagat@nxp.com>
Date:	Fri, 18 Mar 2016 12:31:03 +0530
From:	Rajesh Bhagat <rajesh.bhagat@....com>
To:	<linux-usb@...r.kernel.org>, <linux-kernel@...r.kernel.org>
CC:	<gregkh@...uxfoundation.org>, <mathias.nyman@...el.com>,
	<sriram.dash@....com>, Rajesh Bhagat <rajesh.bhagat@....com>
Subject: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI commmand timeout

We are facing issue while performing the system resume operation from STR
where XHCI is going to indefinite hang/sleep state due to
wait_for_completion API called in function xhci_alloc_dev for command
TRB_ENABLE_SLOT which never completes.

Now, xhci_handle_command_timeout function is called and prints
"Command timeout" message but never calls complete API for above
TRB_ENABLE_SLOT command as xhci_abort_cmd_ring is successful.

Solution to above problem is:
1. calling xhci_cleanup_command_queue API even if xhci_abort_cmd_ring
   is successful or not.
2. checking the status of reset_device in usb core code.

Before Fix:
root@...enix:~# echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: suspend of devices complete after 103.144 msecs
PM: late suspend of devices complete after 1.503 msecs
PM: noirq suspend of devices complete after 1.220 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
Retrying again to check for CPU kill
CPU1 killed.
Enabling non-boot CPUs ...
CPU1 is up
PM: noirq resume of devices complete after 1.996 msecs
PM: early resume of devices complete after 1.152 msecs
usb usb1: root hub lost power or was reset
usb usb2: root hub lost power or was reset
----- <<hangs indefinitely>> --------------

After Fix:
root@...enix:~#  echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.001 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: suspend of devices complete after 103.086 msecs
PM: late suspend of devices complete after 1.517 msecs
PM: noirq suspend of devices complete after 1.217 msecs
Disabling non-boot CPUs ...
CPU1: shutdown
Retrying again to check for CPU kill
CPU1 killed.
Enabling non-boot CPUs ...
CPU1 is up
PM: noirq resume of devices complete after 1.991 msecs
PM: early resume of devices complete after 1.239 msecs
usb usb1: root hub lost power or was reset
usb usb2: root hub lost power or was reset
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
xhci-hcd xhci-hcd.0.auto: Error while assigning device slot ID
xhci-hcd xhci-hcd.0.auto: Max number of devices this xHCI host supports is 127.
PM: resume of devices complete after 75567.769 msecs
Restarting tasks ...
usb 1-1: USB disconnect, device number 2
usb 2-1: USB disconnect, device number 2
usb 2-1.1: USB disconnect, device number 3
done.
root@...enix:~#

Signed-off-by: Sriram Dash <sriram.dash@....com>
Signed-off-by: Rajesh Bhagat <rajesh.bhagat@....com>
---
 drivers/usb/core/hub.c       |   12 ++++++++----
 drivers/usb/host/xhci-ring.c |    2 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 38cc4ba..c906018 100644
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -2897,10 +2897,14 @@ done:
 			/* The xHC may think the device is already reset,
 			 * so ignore the status.
 			 */
-			if (hcd->driver->reset_device)
-				hcd->driver->reset_device(hcd, udev);
-
-			usb_set_device_state(udev, USB_STATE_DEFAULT);
+			if (hcd->driver->reset_device) {
+				status = hcd->driver->reset_device(hcd, udev);
+				if (status == 0)
+					usb_set_device_state(udev, USB_STATE_DEFAULT);
+				else
+					usb_set_device_state(udev, USB_STATE_NOTATTACHED);
+			} else
+				usb_set_device_state(udev, USB_STATE_DEFAULT);
 		}
 	} else {
 		if (udev)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 7cf6621..be8fd61 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1272,9 +1272,9 @@ void xhci_handle_command_timeout(unsigned long data)
 		spin_unlock_irqrestore(&xhci->lock, flags);
 		xhci_dbg(xhci, "Command timeout\n");
 		ret = xhci_abort_cmd_ring(xhci);
+		xhci_cleanup_command_queue(xhci);
 		if (unlikely(ret == -ESHUTDOWN)) {
 			xhci_err(xhci, "Abort command ring failed\n");
-			xhci_cleanup_command_queue(xhci);
 			usb_hc_died(xhci_to_hcd(xhci)->primary_hcd);
 			xhci_dbg(xhci, "xHCI host controller is dead.\n");
 		}
-- 
1.7.7.4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ