lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58CFE56C.8050906@linux.intel.com>
Date:   Mon, 20 Mar 2017 16:21:32 +0200
From:   Mathias Nyman <mathias.nyman@...ux.intel.com>
To:     Diego Viola <diego.viola@...il.com>, mathias.nyman@...el.com,
        Roger <rogerable@...ltek.com>
Cc:     Ulf Hansson <ulf.hansson@...aro.org>,
        Greg KH <gregkh@...uxfoundation.org>,
        Wei WANG <wei_wang@...lsil.com.cn>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux USB List <linux-usb@...r.kernel.org>,
        Alan Stern <stern@...land.harvard.edu>
Subject: Re: Dell Inspiron 5558/0VNM2T hangs at resume from suspend when USB 3
 is enabled

On 19.03.2017 23:29, Diego Viola wrote:
> On Fri, Mar 17, 2017 at 5:18 PM, Diego Viola <diego.viola@...il.com> wrote:
>> On Fri, Mar 17, 2017 at 1:57 PM, Diego Viola <diego.viola@...il.com> wrote:
>>> On Fri, Mar 17, 2017 at 1:24 PM, Alan Stern <stern@...land.harvard.edu> wrote:
>>>> On Fri, 17 Mar 2017, Diego Viola wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Here's the log to the netconsole dmesg capture, I've used
>>>>> xhci_hcd.dyndbg no_console_suspend=1 as the kernel parameters.
>>>>>
>>>>> I did the usual suspend/resume cycle with i3lock, it hung after the
>>>>> third attempt when trying to resume from suspend.
>>>>>
>>>>> https://bugzilla.kernel.org/attachment.cgi?id=255309
>>>>
>>>> I'm not an expert on xHCI.  This should be CC'ed to the xhci-hcd
>>>> maintainer.
>>>>
>>>> Alan Stern
>>>>
>>>>>
>>>>> Please let me know if I should provide something else.
>>>>>
>>>>> Thanks,
>>>>> Diego
>>>>>
>>>>
>>>
>>> I've forwarded my email to Mathias Nyman.
>>>
>>> Diego
>>
>> Still a problem with 4.11.0-rc2-ARCH+
>>

 From a quick glance it looks like rtsx_usb_ms probaly takes a mutex (&ucr->dev_mutex)
and then issues a usb_bulk_msg() and waits for it to complete with mutex held.
The usb message times out, usb core kills the urb but the URB probably never gets completed,
and function never returns.

Everyting using ucr->dev_mutex would block, for example the kthread, rtsx_usb_detect_ms_card
that continuously tries to detect a ms card, takes and releases the same ucr->dev_mutex for
each try.

[  614.026502] INFO: task kworker/u8:0:5 blocked for more than 120 seconds.
[  614.027865]       Not tainted 4.10.3-1-ARCH #1
[  614.029116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  614.030467] kworker/u8:0    D    0     5      2 0x00000000
[  614.031812] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
[  614.033179] Call Trace:
[  614.034550]  __schedule+0x22f/0x700
[  614.035940]  schedule+0x3d/0x90
[  614.037334]  schedule_preempt_disabled+0x15/0x20
[  614.038680]  __mutex_lock_slowpath+0x19b/0x2d0
[  614.040067]  ? flush_workqueue+0x204/0x580
[  614.041456]  mutex_lock+0x23/0x30
[  614.042163]  acpi_device_hotplug+0x43/0x3e7
[  614.042882]  acpi_hotplug_work_fn+0x1e/0x29
[  614.043612]  process_one_work+0x1e5/0x470
[  614.044356]  worker_thread+0x48/0x4e0
[  614.045077]  kthread+0x101/0x140
[  614.045788]  ? process_one_work+0x470/0x470
[  614.046495]  ? kthread_create_on_node+0x60/0x60
[  614.047215]  ret_from_fork+0x2c/0x40
[  614.047950] INFO: task rtsx_usb_ms_1:235 blocked for more than 120 seconds.
[  614.048697]       Not tainted 4.10.3-1-ARCH #1
[  614.049465] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  614.050265] rtsx_usb_ms_1   D    0   235      2 0x00000000
[  614.051064] Call Trace:
[  614.051841]  __schedule+0x22f/0x700
[  614.052626]  schedule+0x3d/0x90
[  614.053411]  usb_kill_urb.part.4+0x6c/0xa0 [usbcore]
[  614.054198]  ? wake_atomic_t_function+0x60/0x60
[  614.055005]  usb_kill_urb+0x21/0x30 [usbcore]
[  614.055819]  usb_start_wait_urb+0xe5/0x170 [usbcore]
[  614.056652]  usb_bulk_msg+0xbd/0x160 [usbcore]
[  614.057489]  rtsx_usb_send_cmd+0x63/0x90 [rtsx_usb]
[  614.058306]  rtsx_usb_read_register+0x6c/0xc0 [rtsx_usb]
[  614.059118]  rtsx_usb_detect_ms_card+0x98/0x120 [rtsx_usb_ms]

There is a lot going on in xhci during the last suspend befor this.
URBs are canceled, devices reset and re-enumerated, timeout while reading descriptor,
device firmware changed.

  It's possible we end up in a situation where xhci never givers back the URB.

4.11-rc2 has better xhci tracing, it shows each URB enqueue and dequeue and giveback.

Could you try enabling xhci tracing before suspending (not the same as xhci verbose dynamic debug)
It will generate a lot of data, so better to remove all extra USB devices.

xhci tracing can be added with:

mount -t debugfs none /sys/kernel/debug
echo xhci-hcd >> /sys/kernel/debug/tracing/set_event

and then send the output of cat /sys/kernel/debug/tracing/trace

-Mathias


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ