Message-ID: <51421B89.6020308@web.de>
Date:	Thu, 14 Mar 2013 19:48:41 +0100
From:	Soeren Moch <smoch@....de>
To:	Alan Stern <stern@...land.harvard.edu>,
	Arnd Bergmann <arnd@...db.de>
CC:	USB list <linux-usb@...r.kernel.org>,
	Jason Cooper <jason@...edaemon.net>,
	Andrew Lunn <andrew@...n.ch>,
	Sebastian Hesselbarth <sebastian.hesselbarth@...il.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] USB: EHCI: fix for leaking isochronous data

On 10.03.2013 21:59, Alan Stern wrote:
> On Sun, 10 Mar 2013, Soeren Moch wrote:
>>> On Wed, 20 Feb 2013, Soeren Moch wrote:
>>>
>>>> Ok. I use 2 em2840-based usb sticks (em28xx driver) attached to a
>>>> Marvell Kirkwood-SoC with an orion-ehci usb controller. These usb sticks
>>>> stream dvb data (digital TV) employing isochronous usb transfers (user
>>>> application is vdr).
>>>>
>>>> Starting from linux-3.6 I see
>>>>      ERROR: 1024 KiB atomic DMA coherent pool is too small!
>>>> in the syslog after several tens of minutes (sometimes hours) of streaming
>>>> and then streaming stops.
>>>>
>>>> In linux-3.6 the memory management for the arm architecture was changed,
>>>> so that atomic coherent dma allocations are served from a special pool.
>>>> This pool gets exhausted. The only user of this pool (in my test) is
>>>> orion-ehci. Although I have only 10 URBs in flight (5 for each stick,
>>>> resubmitted in the completion handler), I have 256 atomic coherent
>>>> allocations (memory from the pool is allocated in pages) from orion-ehci
>>>> when I see this error. So I think there must be a memory leak (memory
>>>> allocated atomically somewhere below the usb_submit_urb call in em28xx-core.c).
>>>>
>>>> With other dvb sticks using usb bulk transfers I never see this error.
>>>>
>>>> Since you already found a memory leak in the ehci driver for isoc
>>>> transfers, I hoped you could help solve this problem. If there are
>>>> additional questions, please ask. If there is something I can test, I
>>>> would be glad to do so.
>>>
>>> I guess the first thing is to get a dmesg log showing the problem.  You
>>> should build a kernel with CONFIG_USB_DEBUG enabled and post the part
>>> of the dmesg output starting from when you plug in the troublesome DVB
>>> stick.
>>
>> Sorry for my late response. I have now built a 3.8.0 kernel with usb_debug
>> enabled. See below for the syslog of device plug-in.
>>
>>> It also might help to have a record of all the isochronous-related
>>> coherent allocations and deallocations done by the ehci-hcd driver.
>>> Are you comfortable making your own debugging changes?  The allocations
>>> are done by a call to dma_pool_alloc() in
>>> drivers/usb/host/ehci-sched.c:itd_urb_transaction() if the device runs
>>> at high speed and sitd_urb_transaction() if the device runs at full
>>> speed.  The deallocations are done by calls to dma_pool_free() in
>>> ehci-timer.c:end_free_itds().
>>>
>>
>> I added a debug message to
>> drivers/usb/host/ehci-sched.c:itd_urb_transaction() to log the
>> allocation flags, see log below.
>
> But it looks like you didn't add a message to end_free_itds(), so we
> don't know when the memory gets deallocated.  And you didn't print out
> the values of urb, num_itds, and i, or the value of itd (so we can
> match up allocations against deallocations).

OK, I will add this more detailed logging. But with several
allocations per second and a runtime of several hours, it will produce
a very long logfile.
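
A long log of this kind can be reduced offline to just the allocations
that were never matched by a deallocation. The following is only a
minimal sketch: the message format ("itd alloc ... itd=<addr>" and
"itd free ... itd=<addr>") is a hypothetical one chosen for
illustration, not what any existing debug patch prints, and the
function name outstanding_itds() is likewise made up.

```python
import re
from collections import Counter

# Hypothetical log format (an assumption, not existing kernel output):
#   "... itd alloc urb=<ptr> i=<n> itd=<addr>"
#   "... itd free itd=<addr>"
# Adjust the patterns to whatever the debug messages actually print.
ALLOC_RE = re.compile(r"itd alloc .*\bitd=(0x[0-9a-f]+)")
FREE_RE = re.compile(r"itd free .*\bitd=(0x[0-9a-f]+)")


def outstanding_itds(lines):
    """Return {itd address: count} for every address whose number of
    allocations differs from its number of deallocations."""
    live = Counter()
    for line in lines:
        m = ALLOC_RE.search(line)
        if m:
            live[m.group(1)] += 1
            continue
        m = FREE_RE.search(line)
        if m:
            live[m.group(1)] -= 1
    # Addresses can be reused, so count per address and keep only
    # the ones that do not balance out.
    return {addr: n for addr, n in live.items() if n != 0}
```

Feeding it the lines of the syslog (for example, from an open file
object) leaves only the unbalanced addresses, so the full log does not
have to be read by hand. A positive count means more allocations than
frees were logged for that address.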

>> To me it looks like nothing is
>> allocated atomically here, so this function should not be the root cause of
>> the dma coherent pool exhaustion.
>
> I don't understand.  If non-atomic allocations can't exhaust the pool,
> why do we see these allocations fail?

Good point. Unfortunately, I am not familiar with the details of the
memory management.

Arnd, can memory allocated with dma_pool_alloc() and gfp_flags 
0x20000093 or 0x80000093 come from the atomic dma coherent pool?

>> Are there other allocation functions
>> which I could track?
>
> Yes, but they wouldn't be used for isochronous transfers.  See
> ehci_qtd_alloc(), ehci_qtd_free(), ehci_qh_alloc(), and qh_destroy() in
> ehci-mem.c, as well as some other one-time-only coherent allocations in
> that file.
>
> Alan Stern
>
Soeren Moch