Date:	Mon, 21 Mar 2016 04:18:17 +0000
From:	Rajesh Bhagat
To:	Mathias Nyman
CC:	Sriram Dash
Subject: RE: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI
 commmand timeout

> -----Original Message-----
> From: Mathias Nyman
> Sent: Friday, March 18, 2016 4:51 PM
> To: Rajesh Bhagat
> Cc: Sriram Dash
> Subject: Re: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI
> commmand timeout
> On 18.03.2016 09:01, Rajesh Bhagat wrote:
> > We are facing issue while performing the system resume operation from
> > STR where XHCI is going to indefinite hang/sleep state due to
> > wait_for_completion API called in function xhci_alloc_dev for command
> > TRB_ENABLE_SLOT which never completes.
> >
> > Now, xhci_handle_command_timeout function is called and prints
> > "Command timeout" message but never calls complete API for above
> > TRB_ENABLE_SLOT command as xhci_abort_cmd_ring is successful.
> >
> > Solution to above problem is:
> > 1. calling xhci_cleanup_command_queue API even if xhci_abort_cmd_ring
> >     is successful or not.
> > 2. checking the status of reset_device in usb core code.
> Hi
> I think clearing the whole command ring is a bit too much in this case.
> It may cause issues for all attached devices when one command times out.

Hi Mathias, 

I understand your point, but I would like to understand how the completion handler would be
called if a command times out and xhci_abort_cmd_ring succeeds. In that case, all the code
waiting on the completion would wait forever.

> We need to look in more detail why we fail to call completion for that one aborted
> command.

I checked the code below. Please correct me if I am wrong.

Code waiting on wait_for_completion:
int xhci_alloc_dev(struct usb_hcd *hcd, struct usb_device *udev)
{
        ...
        ret = xhci_queue_slot_control(xhci, command, TRB_ENABLE_SLOT, 0);
        ...
        wait_for_completion(command->completion);  /* <=== waits for the command to complete */
        ...
}

Code calling the completion handler:
1. handle_cmd_completion -> xhci_complete_del_and_free_cmd
2. xhci_handle_command_timeout -> xhci_abort_cmd_ring (failure) -> xhci_cleanup_command_queue -> xhci_complete_del_and_free_cmd

In our case the command timed out, so we hit case #2, but xhci_abort_cmd_ring succeeds and
therefore complete is never called.
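
To make the two paths above easier to discuss, here is a minimal userspace model of them. It is only a sketch, not kernel code: every name prefixed with model_ is hypothetical, and it follows the call paths as described in this mail. It deliberately ignores any completion event the controller may deliver after a successful abort. The always_cleanup flag stands for the behaviour proposed in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued xHCI command; not the kernel struct. */
struct model_cmd {
    bool completed; /* true once the waiter would have been woken */
};

/* Models the effect of complete(): the waiter is woken. */
void model_complete(struct model_cmd *cmd)
{
    assert(cmd != NULL);
    cmd->completed = true;
}

/* Path 1: a command completion event arrives from the controller. */
void model_handle_cmd_completion(struct model_cmd *cmd)
{
    model_complete(cmd);
}

/*
 * Path 2: the command timer fires. As described in this mail, the queue
 * cleanup (and with it the completion) is only reached when the abort
 * fails. With always_cleanup set, the completion happens in both cases.
 */
void model_handle_command_timeout(struct model_cmd *cmd,
                                  bool abort_ok, bool always_cleanup)
{
    if (!abort_ok || always_cleanup)
        model_complete(cmd);
    /* abort_ok && !always_cleanup: nothing wakes the waiter here */
}

/* Returns true if a waiter on the command is woken by the timeout path. */
bool model_waiter_woken_after_timeout(bool abort_ok, bool always_cleanup)
{
    struct model_cmd cmd = { .completed = false };

    model_handle_command_timeout(&cmd, abort_ok, always_cleanup);
    return cmd.completed;
}
```

In this model, model_waiter_woken_after_timeout(true, false) is the case we hit: the abort succeeds and the waiter is never woken by the timeout path.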

> The bigger question is why the timeout happens in the first place?

We are doing a suspend/resume operation, so it might be a controller issue :(. IMO software
should not hang or stop when the hardware is not behaving correctly.
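
As an illustration of what "should not hang" could mean for a waiter, below is a small userspace model of a bounded wait. It is a sketch under assumptions, not driver code: the struct, the function, and the simulated tick counter are all hypothetical. The return convention (0 on timeout, otherwise the time left, at least 1) is modelled on the kernel's wait_for_completion_timeout().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a completion that hardware may or may not signal. */
struct model_completion {
    bool done;                  /* set once the completion is signalled */
    unsigned int fires_at_tick; /* simulated tick at which hardware answers */
    bool hw_responds;           /* false models a controller that never answers */
};

/*
 * Wait for at most max_ticks simulated ticks.
 * Returns 0 if the wait timed out, otherwise the number of ticks left
 * (always at least 1), so the caller can tell the two cases apart.
 */
unsigned int model_wait_timeout(struct model_completion *c,
                                unsigned int max_ticks)
{
    assert(c != NULL);

    for (unsigned int tick = 0; tick < max_ticks; tick++) {
        /* Simulated hardware: signal the completion once its tick is reached. */
        if (c->hw_responds && tick >= c->fires_at_tick)
            c->done = true;
        if (c->done)
            return max_ticks - tick;
    }
    return 0; /* gave up instead of blocking forever */
}
```

With such a wait, a caller can return an error to the upper layer when the result is 0, rather than sleeping indefinitely on a command that never completes.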

> What kernel version, and what xhci vendor was this triggered on?

We are using the 4.1.8 kernel.

> It's possible that the timeout is related either to the locking issue found by Chris
> Bainbridge:
> or the resume issues in this thread, (see full thread)
> Does any of those proposed solutions fix the command timeout for you?

I will check the above patches and share the status.

> -Mathias
