lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Wed, 05 Nov 2014 13:18:50 -0500
From:	Joseph Salisbury <joseph.salisbury@...onical.com>
To:	Pratyush Anand <pratyush.anand@...com>
CC:	Greg KH <gregkh@...uxfoundation.org>,
	Alan Stern <stern@...land.harvard.edu>,
	"dan.j.williams@...el.com" <dan.j.williams@...el.com>,
	"pmladek@...e.cz" <pmladek@...e.cz>,
	Sarah Sharp <sarah.a.sharp@...ux.intel.com>,
	"peter.chen@...escale.com" <peter.chen@...escale.com>,
	USB list <linux-usb@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [Regression][v3.17][v3.18]   USB: Fix persist resume of some
 SS USB devices

On 11/05/2014 03:48 AM, Pratyush Anand wrote:
> On Wed, Nov 05, 2014 at 10:53:35AM +0530, Pratyush Anand wrote:
>> Hello Joe,
>>
>> On Wed, Nov 05, 2014 at 06:20:06AM +0800, Joseph Salisbury wrote:
>>> Hello Pratyush,
>>>
>>> A kernel bug report was opened against Ubuntu [0].  After a kernel
>>> bisect, it was found that reverting the following commit resolved this bug:
>>>
>>> commit a40178b2fa6ad87670fb1e5fa4024db00c149629
>>> Author: Pratyush Anand <pratyush.anand@...com>
>>> Date:   Fri Jul 18 12:37:10 2014 +0530
>>>
>>>     USB: Fix persist resume of some SS USB devices
>>>
>>> The regression was introduced as of v3.17-rc1.  The regression still
>>> exists in the 3.18-rc3 mainline kernel, and has made it's way into the
>>> stable kernels.
>> Its strange, as per my understanding patch does not introduce any side
>> effect for devices whose resume time is normal. So, if a device is
>> connected to the SS port and it was working after resume from
>> suspend to ram without this patch, then that device should still work
>> with this patch.
>>
>> Infact this has resolved another bug reported to bugzilla
>> https://bugzilla.kernel.org/show_bug.cgi?id=53211
>>
>>
>>> I was hoping to get your feedback, since you are the patch author.  Do
>>> you think gathering any additional data will help diagnose this issue,
>> Yaa, sure additional info will help to understand the issue.
>> -- dmesg log of suspend resume when this patch is not applied and also
>> when applied.
> I see that you have already provided both log to the buglink. I had a
> look on that. 
>
>
> When it fails (with this patch):
> Oct 22 15:38:59 mana kernel: [   59.825122] PM: resume of devices
> complete after 22223.878 msecs
>
> When it passed (without this patch):
> Oct 22 15:37:19 mana kernel: [   53.495109] PM: resume of devices
> complete after 3641.933 msecs
> However, even when patch was not present(and it passed), there is a logical
> disconnection of devices, so you would face the issue mentioned in
> https://bugzilla.kernel.org/show_bug.cgi?id=53211.
>
> Looking to the timeout, it seems that wait loop went for full length
> (2S) for all the devices, still could not find a connected device. So,
> basically SS link between root hub and hub was not up in 2 sec. Not
> sure why, but reading further hub port status caused some fatal issue
> with xhci host and so it is not able to get new device connection
> after logical disconnection.
>
> Increasing the timeout value may help. But with this long timeout I see a
> possibility of sync issue between port_event and usb_port_resume. It
> might happen that port_event reads hub port status before
> usb_port_resume handles reset_resume.
>
> Can you try following patch with step increasing varied value of
> PORT_ENABLE_WAIT, and then let me know which value of
> PORT_ENABLE_WAIT works for you (if it works ;)).
>
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index 11e80ac31324..6eaf481d3d53 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -3311,13 +3311,14 @@ static int finish_port_resume(struct usb_device *udev)
>   * This routine should only be called when persist is enabled for a SS
>   * device.
>   */
> +#define PORT_ENABLE_WAIT	2000
>  static int wait_for_ss_port_enable(struct usb_device *udev,
>  		struct usb_hub *hub, int *port1,
>  		u16 *portchange, u16 *portstatus)
>  {
>  	int status = 0, delay_ms = 0;
>  
> -	while (delay_ms < 2000) {
> +	while (delay_ms < PORT_ENABLE_WAIT) {
>  		if (status || *portstatus & USB_PORT_STAT_CONNECTION)
>  			break;
>  		msleep(20);
> @@ -4881,6 +4882,22 @@ static void hub_port_connect_change(struct usb_hub *hub, int port1,
>  	usb_lock_port(port_dev);
>  }
>  
> +static void wait_for_reset_resume(struct usb_hub *hub, int port)
> +{
> +	struct usb_device *udev;
> +	int delay_ms = 0;
> +
> +	udev = hub->ports[port -1]->child;
> +	if (udev && udev->reset_resume) {
> +		while (delay_ms < PORT_ENABLE_WAIT) {
> +			if (!udev->reset_resume)
> +				break;
> +			msleep(20);
> +			delay_ms += 20;
> +		}
> +	}
> +}
> +
>  static void port_event(struct usb_hub *hub, int port1)
>  		__must_hold(&port_dev->status_lock)
>  {
> @@ -4894,6 +4911,8 @@ static void port_event(struct usb_hub *hub, int port1)
>  	clear_bit(port1, hub->event_bits);
>  	clear_bit(port1, hub->wakeup_bits);
>  
> +	wait_for_reset_resume(hub, port1);
> +
>  	if (hub_port_status(hub, port1, &portstatus, &portchange) < 0)
>  		return;
>  
> Regards
> Pratyush

Thanks for the feedback, Pratyush.  We'll test your patch and get back
to you.


>
>> -- dmesg log with following additional debug print.
>> @@ -3318,6 +3318,7 @@ static int wait_for_ss_port_enable(struct usb_device *udev,
>>         int status = 0, delay_ms = 0;
>>          
>>         while (delay_ms < 2000) {
>> +               printk("%s:portstatus is %x and portchange is %x\n", __func__, *portstatus, *portchange);
>>                if (status || *portstatus & USB_PORT_STAT_CONNECTION)
>>                         break;
>>
>>> or would it be best to submit a revert request?
>> Let me see what is the portstatus value and why didn't it break loop in first
>> iteration if device was OK.
>>
>> Regards
>> Pratyush
>>
>>>    
>>>
>>> Thanks,
>>>    
>>> Joe
>>>
>>> [0] http://pad.lv/1384041

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists