Message-ID: <alpine.LSU.2.00.0811130839250.9837@eleven.esr.bruker.de>
Date: Thu, 13 Nov 2008 09:04:11 +0100 (CET)
From: Michael Dressel <Michael.Dressel@...ker-biospin.de>
To: Ian Kent <raven@...maw.net>
cc: Michael Dressel <Michael.Dressel@...ker-biospin.de>,
linux-kernel@...r.kernel.org, bart.vanassche@...il.com
Subject: Re: pthread_mutex_lock hangs on unlocked mutex
On Thu, 13 Nov 2008, Ian Kent wrote:
> On Fri, 7 Nov 2008, Michael Dressel wrote:
>
>> Hi,
>>
>> (I'm not subscribed to the list, please CC me.)
>>
>> in our software three processes are using several pthread mutexes.
>> Sometimes a process hangs inside pthread_mutex_lock even though the mutex is
>> not locked. I can tell it's not locked because another process is still
>> running and locking and unlocking the mutex.
>
> pthreads is implemented in glibc.
> If you really think there is a bug in the pthreads implementation then
> the glibc maintainers will require you to produce a simple example program
> which demonstrates the bug before it's accepted as a bug.
Yes.
I have not found any report related exactly to my problem in the
mailing lists or bug reports. But to be sure I didn't overlook something
I posted my problem. It looks like it's unique to me.
I failed to produce a simple example demonstrating the problem. In our
code we use timers and real time signals and we change process masks
with sigprocmask. If there is a bug at all (I don't think so) a
program to demonstrate that bug would potentially have to do all of
these things and would therefore not be simple.
Following Bart Van Assche's suggestion, I ran valgrind (the default tool
and helgrind) but did not find anything obviously related to my problem.
>
> When you say processes you mean threads, right?
>
No. We don't use threads. The mutexes are used between processes. I used
them because they feature recursion.
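
For readers following the thread, a minimal sketch of what "recursive
mutexes used between processes" can look like. It assumes the mutex lives
in anonymous shared memory inherited across fork(); the actual application
may share it differently (e.g. via shm_open or SysV shared memory). The
names shared_state, shared_state_create, nested_increment and run_children
are hypothetical and not taken from the original code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: one mutex plus the data it protects. */
struct shared_state {
    pthread_mutex_t lock;
    int counter;
};

/* Map the state into anonymous shared memory (inherited across fork) and
 * initialise the mutex as both process-shared and recursive.
 * Returns NULL on failure. */
struct shared_state *shared_state_create(void)
{
    struct shared_state *s = mmap(NULL, sizeof(*s), PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    int ok = pthread_mutexattr_init(&attr) == 0;
    if (ok) {
        ok = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) == 0 &&
             pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) == 0 &&
             pthread_mutex_init(&s->lock, &attr) == 0;
        pthread_mutexattr_destroy(&attr);
    }
    if (!ok) {
        munmap(s, sizeof(*s));
        return NULL;
    }
    s->counter = 0;
    return s;
}

/* Take the lock twice from the same process, as a recursive caller would.
 * Returns 0 on success or the first pthread error code. */
int nested_increment(struct shared_state *s)
{
    assert(s != NULL);
    int rc = pthread_mutex_lock(&s->lock);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&s->lock);   /* second acquisition by the owner */
    if (rc == 0) {
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}

/* Fork `nproc` children that each call nested_increment() `iters` times.
 * Returns the final counter, or -1 if fork or any child failed. */
int run_children(struct shared_state *s, int nproc, int iters)
{
    assert(s != NULL);
    int failed = 0, started = 0;
    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            for (int j = 0; j < iters; j++)
                if (nested_increment(s) != 0)
                    _exit(1);
            _exit(0);
        }
        if (pid < 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    return failed ? -1 : s->counter;
}
```

Without PTHREAD_PROCESS_SHARED the behaviour of a mutex used from more
than one process is undefined, so that attribute is the first thing worth
checking in a setup like this. On older glibc versions the program needs
to be linked with -pthread.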
> If you can't produce such an example program and you can prove (to
> yourself) there are no use after free or execution order issues with your
> code then your only option is to develop a workaround.
>
I found a workaround. We use normal semaphores now. This is possible
because we don't use multiple threads. In order to provide recursion I
had to implement a per process counter. This would not work if the
semaphore was required during signal handler execution. But this does not
happen in our application.
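
A rough sketch of this kind of workaround, under stated assumptions: it
uses a POSIX unnamed semaphore (sem_init with pshared set to 1) placed in
anonymous shared memory, although "normal semaphores" in the message could
equally mean SysV semaphores. The names rec_sem, rec_sem_create,
rec_sem_lock and rec_sem_unlock are hypothetical. The depth counter is
ordinary process-private memory, which is why this only works for
single-threaded processes that never take the lock from a signal handler.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: a binary semaphore in shared memory plus a
 * recursion depth kept privately by each process. */
struct rec_sem {
    sem_t *sem;   /* points into MAP_SHARED memory, visible to all processes */
    int depth;    /* private: how many times this process holds the lock */
};

/* Create the shared semaphore with an initial value of 1 (unlocked).
 * Call before fork() so that children inherit the mapping.
 * Returns 0 on success, -1 on failure. */
int rec_sem_create(struct rec_sem *r)
{
    r->sem = mmap(NULL, sizeof(*r->sem), PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r->sem == MAP_FAILED)
        return -1;
    if (sem_init(r->sem, 1, 1) != 0) {
        munmap(r->sem, sizeof(*r->sem));
        return -1;
    }
    r->depth = 0;
    return 0;
}

/* Only the outermost lock call touches the semaphore; nested calls just
 * bump the private counter. */
int rec_sem_lock(struct rec_sem *r)
{
    if (r->depth == 0) {
        /* Retry if a signal (e.g. a timer) interrupts the wait. */
        while (sem_wait(r->sem) != 0)
            if (errno != EINTR)
                return -1;
    }
    r->depth++;
    return 0;
}

/* Only the outermost unlock call releases the semaphore. */
int rec_sem_unlock(struct rec_sem *r)
{
    assert(r->depth > 0);
    if (--r->depth == 0)
        return sem_post(r->sem);
    return 0;
}
```

The EINTR retry matters in a program that uses timers and real-time
signals, because sem_wait can return early when a handler runs.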
> Your code wouldn't happen to be doing thread synchronization with
> pthread_cond_wait()/pthread_cond_signal() would it?
No, since we don't use multiple threads.
The reason I wondered whether this is an issue (maybe of configuration)
with the kernel is that sending a STOP/CONT signal sequence to the hanging
process got it going again. So at least it is not a classical deadlock.
Cheers,
Michael