lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+ToGPFXoqfn7PyM6iDthMN0jWVC5EKUtmzc3M-gJxcy1he_nw@mail.gmail.com>
Date:   Mon, 20 Mar 2017 12:39:51 -0300
From:   Diego Viola <diego.viola@...il.com>
To:     Mathias Nyman <mathias.nyman@...ux.intel.com>
Cc:     mathias.nyman@...el.com, Roger <rogerable@...ltek.com>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        Greg KH <gregkh@...uxfoundation.org>,
        Wei WANG <wei_wang@...lsil.com.cn>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux USB List <linux-usb@...r.kernel.org>,
        Alan Stern <stern@...land.harvard.edu>
Subject: Re: Dell Inspiron 5558/0VNM2T hangs at resume from suspend when USB 3
 is enabled

On Mon, Mar 20, 2017 at 11:21 AM, Mathias Nyman
<mathias.nyman@...ux.intel.com> wrote:
> On 19.03.2017 23:29, Diego Viola wrote:
>>
>> On Fri, Mar 17, 2017 at 5:18 PM, Diego Viola <diego.viola@...il.com>
>> wrote:
>>>
>>> On Fri, Mar 17, 2017 at 1:57 PM, Diego Viola <diego.viola@...il.com>
>>> wrote:
>>>>
>>>> On Fri, Mar 17, 2017 at 1:24 PM, Alan Stern <stern@...land.harvard.edu>
>>>> wrote:
>>>>>
>>>>> On Fri, 17 Mar 2017, Diego Viola wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Here's the log to the netconsole dmesg capture, I've used
>>>>>> xhci_hcd.dyndbg no_console_suspend=1 as the kernel parameters.
>>>>>>
>>>>>> I did the usual suspend/resume cycle with i3lock, it hung after the
>>>>>> third attempt when trying to resume from suspend.
>>>>>>
>>>>>> https://bugzilla.kernel.org/attachment.cgi?id=255309
>>>>>
>>>>>
>>>>> I'm not an expert on xHCI.  This should be CC'ed to the xhci-hcd
>>>>> maintainer.
>>>>>
>>>>> Alan Stern
>>>>>
>>>>>>
>>>>>> Please let me know if I should provide something else.
>>>>>>
>>>>>> Thanks,
>>>>>> Diego
>>>>>>
>>>>>
>>>>
>>>> I've forwarded my email to Mathias Nyman.
>>>>
>>>> Diego
>>>
>>>
>>> Still a problem with 4.11.0-rc2-ARCH+
>>>
>
> From a quick glance it looks like rtsx_usb_ms probaly takes a mutex
> (&ucr->dev_mutex)
> and then issues a usb_bulk_msg() and waits for it to complete with mutex
> held.
> The usb message times out, usb core kills the urb but the URB probably never
> gets completed,
> and function never returns.
>
> Everyting using ucr->dev_mutex would block, for example the kthread,
> rtsx_usb_detect_ms_card
> that continuously tries to detect a ms card, takes and releases the same
> ucr->dev_mutex for
> each try.
>
> [  614.026502] INFO: task kworker/u8:0:5 blocked for more than 120 seconds.
> [  614.027865]       Not tainted 4.10.3-1-ARCH #1
> [  614.029116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  614.030467] kworker/u8:0    D    0     5      2 0x00000000
> [  614.031812] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
> [  614.033179] Call Trace:
> [  614.034550]  __schedule+0x22f/0x700
> [  614.035940]  schedule+0x3d/0x90
> [  614.037334]  schedule_preempt_disabled+0x15/0x20
> [  614.038680]  __mutex_lock_slowpath+0x19b/0x2d0
> [  614.040067]  ? flush_workqueue+0x204/0x580
> [  614.041456]  mutex_lock+0x23/0x30
> [  614.042163]  acpi_device_hotplug+0x43/0x3e7
> [  614.042882]  acpi_hotplug_work_fn+0x1e/0x29
> [  614.043612]  process_one_work+0x1e5/0x470
> [  614.044356]  worker_thread+0x48/0x4e0
> [  614.045077]  kthread+0x101/0x140
> [  614.045788]  ? process_one_work+0x470/0x470
> [  614.046495]  ? kthread_create_on_node+0x60/0x60
> [  614.047215]  ret_from_fork+0x2c/0x40
> [  614.047950] INFO: task rtsx_usb_ms_1:235 blocked for more than 120
> seconds.
> [  614.048697]       Not tainted 4.10.3-1-ARCH #1
> [  614.049465] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  614.050265] rtsx_usb_ms_1   D    0   235      2 0x00000000
> [  614.051064] Call Trace:
> [  614.051841]  __schedule+0x22f/0x700
> [  614.052626]  schedule+0x3d/0x90
> [  614.053411]  usb_kill_urb.part.4+0x6c/0xa0 [usbcore]
> [  614.054198]  ? wake_atomic_t_function+0x60/0x60
> [  614.055005]  usb_kill_urb+0x21/0x30 [usbcore]
> [  614.055819]  usb_start_wait_urb+0xe5/0x170 [usbcore]
> [  614.056652]  usb_bulk_msg+0xbd/0x160 [usbcore]
> [  614.057489]  rtsx_usb_send_cmd+0x63/0x90 [rtsx_usb]
> [  614.058306]  rtsx_usb_read_register+0x6c/0xc0 [rtsx_usb]
> [  614.059118]  rtsx_usb_detect_ms_card+0x98/0x120 [rtsx_usb_ms]
>
> There is a lot going on in xhci during the last suspend befor this.
> URBs are canceled, devices reset and re-enumerated, timeout while reading
> descriptor,
> device firmware changed.
>
>  It's possible we end up in a situation where xhci never givers back the
> URB.
>
> 4.11-rc2 has better xhci tracing, it shows each URB enqueue and dequeue and
> giveback.
>
> Could you try enabling xhci tracing before suspending (not the same as xhci
> verbose dynamic debug)
> It will generate a lot of data, so better to remove all extra USB devices.
>
> xhci tracing can be added with:
>
> mount -t debugfs none /sys/kernel/debug
> echo xhci-hcd >> /sys/kernel/debug/tracing/set_event
>
> and then send the output of cat /sys/kernel/debug/tracing/trace
>
> -Mathias
>
>

https://bugzilla.kernel.org/attachment.cgi?id=255367

This is with Linux 4.11.0-rc3-ARCH.

USB mouse/keyboard was unplugged before booting the machine.

I didn't do a suspend/resume before getting this trace, should I do that?

Should I reproduce the hang and get a netconsole dmesg capture with
tracing enabled?

Thanks,
Diego

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ