Message-ID: <4EFB4429.4090002@lwfinger.net>
Date:	Wed, 28 Dec 2011 10:30:33 -0600
From:	Larry Finger <Larry.Finger@...inger.net>
To:	Andiry Xu <andiry.xu@....com>
CC:	Sarah Sharp <sarah.a.sharp@...ux.intel.com>,
	USB list <linux-usb@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, v4mp <gaigo88@...mail.it>
Subject: Re: Question about error from xhci-hcd

On 11/14/2011 03:18 AM, Andiry Xu wrote:
> On 11/02/2011 12:06 AM, Larry Finger wrote:
>> On 10/30/2011 12:04 AM, Sarah Sharp wrote:
>>
>>> The xHCI driver allocates a fixed-size endpoint ring, and only so much
>>> data can fit on it.  If the driver is allocating many URBs or many URBs
>>> with a lot of data, then you will see these messages and the URBs will
>>> fail to be submitted.  Now if neither of those conditions is true, then
>>> it's possible we just have a bug in the xHCI driver.
>>>
>>> There is a patchset in the works to dynamically expand the endpoint
>>> rings, but it's still going through revisions:
>>>
>>> http://marc.info/?l=linux-usb&m=131918645424329&w=2
>>
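The fixed-size ring behaviour described above can be modelled roughly as follows. This is only a sketch, not the xHCI driver's code: RING_SLOTS, struct ep_ring, ring_submit() and ring_complete() are hypothetical names, the ring size is picked for illustration, and returning -ENOMEM when there is no room is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ring size for illustration; not taken from the driver. */
#define RING_SLOTS 16

struct ep_ring {
	unsigned int enqueue;	/* next slot the driver would write */
	unsigned int dequeue;	/* next slot the controller would complete */
	unsigned int used;	/* slots queued but not yet completed */
};

void ring_init(struct ep_ring *ring)
{
	assert(ring != NULL);
	ring->enqueue = 0;
	ring->dequeue = 0;
	ring->used = 0;
}

unsigned int ring_free_slots(const struct ep_ring *ring)
{
	return RING_SLOTS - ring->used;
}

/*
 * Queue one transfer that needs `slots` ring entries.
 * Returns 0 on success, -EINVAL for an impossible request, and
 * -ENOMEM (an assumption of this model) when the transfer does not fit,
 * which corresponds to the "no room on ep ring" condition.
 */
int ring_submit(struct ep_ring *ring, unsigned int slots)
{
	assert(ring != NULL);
	if (slots == 0 || slots > RING_SLOTS)
		return -EINVAL;
	if (slots > ring_free_slots(ring))
		return -ENOMEM;
	ring->enqueue = (ring->enqueue + slots) % RING_SLOTS;
	ring->used += slots;
	return 0;
}

/* Model the controller completing up to `slots` queued entries. */
void ring_complete(struct ep_ring *ring, unsigned int slots)
{
	assert(ring != NULL);
	if (slots > ring->used)
		slots = ring->used;
	ring->dequeue = (ring->dequeue + slots) % RING_SLOTS;
	ring->used -= slots;
}
```

With 16 slots and 3 slots per transfer, five submissions succeed and the sixth fails until a completion frees some slots.
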
>> I have a bit more to report. Applying the above patch set did not help.
>>
>> I modified the xHCI driver from 3.1-rc10 to provide a stack dump
>> whenever the messages appeared. The "short transfer on control ep"
>> occurs before the rtl8192cu device has been plugged and has the
>> following dump, which is probably not informative:
>>
>> [    3.988197] xhci_hcd 0000:05:00.0: WARN: short transfer on control ep
>> [    3.988208] Pid: 0, comm: kworker/0:0 Not tainted 3.1.0-0301rc9-generic #201110050905
>> [    3.988213] Call Trace:
>> [    3.988225]  [<c135788d>] ? dev_warn+0x2d/0x30
>> [    3.988238]  [<f80852d5>] xhci_irq+0x1035/0x1050 [xhci_hcd]
>> [    3.988249]  [<c1079827>] ? tick_program_event+0x27/0x40
>> [    3.988261]  [<f808531c>] xhci_msi_irq+0x2c/0x30 [xhci_hcd]
>> [    3.988270]  [<c10ac5b8>] handle_irq_event_percpu+0x48/0x190
>> [    3.988279]  [<c10aee40>] ? irq_set_chip_and_handler_name+0x40/0x40
>> [    3.988286]  [<c10ac73f>] handle_irq_event+0x3f/0x60
>> [    3.988294]  [<c10aee40>] ? irq_set_chip_and_handler_name+0x40/0x40
>> [    3.988301]  [<c10aee9b>] handle_edge_irq+0x5b/0xf0
>> [    3.988305]  <IRQ>  [<c1546a31>] ? do_IRQ+0x41/0xb0
>> [    3.988320]  [<c1542950>] ? notifier_call_chain+0x30/0x60
>> [    3.988328]  [<c1546970>] ? common_interrupt+0x30/0x38
>> [    3.988337]  [<c104007b>] ? sched_debug_show+0x11b/0x5f0
>> [    3.988345]  [<c12e5524>] ? intel_idle+0xa4/0x100
>> [    3.988355]  [<c142833c>] ? cpuidle_idle_call+0xac/0x160
>> [    3.988364]  [<c1001c27>] ? cpu_idle+0x97/0xd0
>> [    3.988368]  [<c1537e16>] ? start_secondary+0xf6/0x110
>>
>> Just in case it is needed, the full dmesg output is attached.
>>
>> Due to wrapping of the dmesg buffer, the first few stack dumps for
>> the "ERROR no room on ep ring" messages were lost, but the one I got
>> came from the following code fragment in
>> drivers/net/wireless/rtlwifi/usb.c at line 87:
>>
>>          usb_fill_control_urb(urb, udev, pipe,
>>                               (unsigned char *)dr, buf, len,
>>                               usbctrl_async_callback, buf);
>>          rc = usb_submit_urb(urb, GFP_ATOMIC);
>>
>> The value of len for this call is 4. The driver only uses 1, 2, or 4 as
>> the lengths of writes, at least those that go through usb_submit_urb().
>> Even the firmware download is done one dword at a time.
>>
>> We also tested with the xHCI code from the current mainline kernel, i.e.
>> 3.1-git, but I don't have the dmesg output for that version. If you have
>> any patches in the pipeline, or anything to test, please send those to me.
>>
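As a rough aid to the numbers above: a control transfer on xHCI is queued as a Setup Stage TRB, an optional Data Stage TRB, and a Status Stage TRB, so a small write with len = 4 would occupy three ring entries. The helper names below are hypothetical, and the usable ring size used in the note after the code is an assumption for illustration, not a value read from the driver.

```c
#include <stddef.h>

/*
 * Ring entries (TRBs) needed for one small control transfer:
 * one Setup Stage TRB, one Status Stage TRB, plus one Data Stage TRB
 * when there is a data payload. Assumes the payload fits in one TRB,
 * which holds for the 1-, 2- and 4-byte writes discussed here.
 */
unsigned int control_transfer_trbs(size_t data_len)
{
	return data_len > 0 ? 3u : 2u;
}

/*
 * How many such transfers fit into `usable_trbs` free ring entries
 * if none of them complete in the meantime.
 */
unsigned int control_transfers_that_fit(unsigned int usable_trbs,
					size_t data_len)
{
	return usable_trbs / control_transfer_trbs(data_len);
}
```

Under the assumption of 63 usable entries, control_transfers_that_fit(63, 4) gives 21, so a burst of more than 21 asynchronous 4-byte writes with no completions in between would run out of room in this model.
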
>
> A control transfer ring should not be full. Only isoc and bulk transfers
> can fill a ring, when a lot of TDs are submitted simultaneously. I
> suspect the ring is mangled.
>
> Please apply the patch attached, enable CONFIG_USB_DEBUG and
> CONFIG_USB_XHCI_HCD_DEBUGGING and post the dmesg with the "no room on ep
> ring" error.

Sorry to take so long to get this diagnostic info to you.

Attached is the dmesg output. One of the "short transfer" messages appears at
216.57+ seconds.

Thanks for looking at this.

Larry


View attachment "composit_dmesg" of type "text/plain" (381016 bytes)
