Message-ID: <4F582AC1.6040300@linux.intel.com>
Date: Wed, 07 Mar 2012 19:42:57 -0800
From: Darren Hart <dvhart@...ux.intel.com>
To: jon@...shouse.co.uk
CC: Huang Shijie <shijie8@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: Random process lockup on ARM board: alsa-lib-1.0.25, FUTEX_WAIT_PRIVATE
On 03/07/2012 01:22 PM, Jonathan Andrews wrote:
> On Wed, 2012-03-07 at 12:07 -0800, Darren Hart wrote:
>>
>> On 02/29/2012 01:12 AM, Huang Shijie wrote:
>>> Hi ,
>>>
>>> I have hit a similar problem with the latest futex code.
>>>
>>> When I play a video, the processes hang at the futex.
>>
>> Are either of you able to bisect the kernel?
> I'm not a kernel hacker. What do you mean?
Google for "git bisect". It's a way to divide-and-conquer and find
exactly where the kernel stops working for you. You would need a known
working kernel though. If you don't have a known working kernel, then
this is either a long standing bug (not a regression) or, I suspect more
likely, a locking bug or race in the userspace code.
When you see your thread stuck in FUTEX_WAIT_PRIVATE, that in and of
itself is not an indication of a problem. It's waiting there to be woken
(FUTEX_WAKE_PRIVATE) by another thread in your application. So the
questions you should be asking are:
  Why is it blocked?
    pthread_cond_wait()?
    pthread_mutex_lock()?
  Why isn't it being woken up?
    Did you call pthread_cond_broadcast() without taking the mutex
    first?
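To make the last question concrete, here is a minimal sketch of the usual
condition variable handshake. The names (lock, cond, ready, waiter,
wake_all, run_handshake) are made up for illustration and are not taken
from your application or from ALSA. The point is that the predicate is
changed and the broadcast is issued while holding the mutex, and the
waiter re-checks the predicate in a loop. If the waker skipped the mutex,
the wakeup could land between the waiter's check and its
pthread_cond_wait(), and the waiter would then sit in FUTEX_WAIT_PRIVATE
with nobody left to wake it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state, for illustration only. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;

/* Waiter: checks the predicate under the mutex, so a wakeup sent
 * before it blocks cannot be lost. */
static void *waiter(void *arg)
{
    int *woken = arg;

    pthread_mutex_lock(&lock);
    while (!ready)                        /* loop also absorbs spurious wakeups */
        pthread_cond_wait(&cond, &lock);  /* releases lock while sleeping */
    assert(ready);                        /* predicate holds once we leave the loop */
    *woken = 1;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Waker: updates the predicate and broadcasts while holding the mutex. */
static void wake_all(void)
{
    pthread_mutex_lock(&lock);
    ready = true;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
}

/* Runs one waiter and one waker; returns 1 if the waiter was woken. */
int run_handshake(void)
{
    pthread_t t;
    int woken = 0;

    if (pthread_create(&t, NULL, waiter, &woken) != 0)
        return 0;
    wake_all();
    pthread_join(t, NULL);
    return woken;
}
```

Build with -pthread. If your code (or a library you call) sets the flag
or signals outside the mutex, that is a likely place to start looking.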
There are other APIs that use futexes under the covers. You should be
able to sort out where in your application your threads are blocked and
thus determine which API you are using - and from there how you were
expecting that to get woken up. These calls may be being made from
within ALSA, in which case you may need to elicit the help of the ALSA
developers.
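For reference, the FUTEX_WAIT_PRIVATE / FUTEX_WAKE_PRIVATE pairing
described earlier can be sketched directly with the raw system call.
This is only an illustration under my own assumptions: futex_word,
futex(), sleeper() and run_futex_demo() are hypothetical names, and
normal applications reach futexes through pthreads rather than calling
the syscall themselves. It shows that a thread sleeping in
FUTEX_WAIT_PRIVATE is simply waiting for another thread in the same
process to change the word and issue FUTEX_WAKE_PRIVATE.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical futex word: 0 means "keep sleeping", 1 means "go". */
static _Atomic uint32_t futex_word = 0;

/* Thin wrapper around the futex system call (Linux only). */
static long futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Sleeps in FUTEX_WAIT_PRIVATE until the word is no longer 0. */
static void *sleeper(void *arg)
{
    (void)arg;
    while (atomic_load(&futex_word) == 0) {
        /* The kernel blocks only if the word still equals 0;
         * otherwise the call fails with EAGAIN and we re-check. */
        long rc = futex(&futex_word, FUTEX_WAIT_PRIVATE, 0);
        if (rc == -1 && errno != EAGAIN && errno != EINTR)
            break;                        /* unexpected error: stop waiting */
    }
    return NULL;
}

/* Starts a sleeper, wakes it, and returns the final value of the word. */
int run_futex_demo(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, sleeper, NULL) != 0)
        return 0;
    atomic_store(&futex_word, 1);               /* change the word first */
    futex(&futex_word, FUTEX_WAKE_PRIVATE, 1);  /* then wake one waiter */
    pthread_join(t, NULL);
    return (int)atomic_load(&futex_word);
}
```

If the store and the FUTEX_WAKE_PRIVATE call never happen, the sleeper
stays blocked forever, which is what a stalled process looks like in
strace.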
--
Darren
>
>> At the very least can you
>> find two kernels where it works and where it does not?
>>
>> Hanging on FUTEX_WAIT_PRIVATE can be the symptom for higher level
>> problems including userspace locking issues and race conditions.
>
>
> My workload is UDP network audio. I have compiled my code with and
> without ALSA support. The version without ALSA seems to run forever; the
> version with ALSA works on ARM for between a few minutes and a few
> hours. On Intel the same futex stall problem occurs, but it may take
> days of runtime.
>
> I have two processes running. One RX process takes UDP packets from
> the network, mixes them and presents them to ALSA as an audio stream; the
> second process takes audio from the sound device and transmits it as a
> UDP audio stream. The two processes are independent.
>
> My workload is atypical as I need to both transmit and receive audio via
> UDP on a 24/7 basis.
>
> So far I have experienced the problem on 3 kernels, but I have tried
> only 3 kernels; it may be all 2.6 kernels that suffer.
>
> My development PC is "Linux jonspc 2.6.32.26-175.fc12.i686 #1 SMP Wed
> Dec 1 21:52:04 UTC 2010 i686 athlon i386 GNU/Linux"
>
> My ARM board target:
> ARM / # uname -a
> Linux (none) 3.2.5 #2 Wed Feb 22 17:11:52 GMT 2012 armv4tl GNU/Linux
>
> And my ARM target running its older kernel was (2.6.36).
>
> I have an strace of the process running and stalling on the PC.
> The file is 2GB, and it's not a fast link, sorry.
>
> http://www.jonshouse.co.uk/download/a_stop.txt
>
>
> Many thanks,
> Jon
>
>
>
--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Linux Kernel