linux-kernel - Re: pthread_mutex_lock hangs on unlocked mutex

Date:	Thu, 13 Nov 2008 09:04:11 +0100 (CET)
From:	Michael Dressel <Michael.Dressel@...ker-biospin.de>
To:	Ian Kent <raven@...maw.net>
cc:	Michael Dressel <Michael.Dressel@...ker-biospin.de>,
	linux-kernel@...r.kernel.org, bart.vanassche@...il.com
Subject: Re: pthread_mutex_lock hangs on unlocked mutex

On Thu, 13 Nov 2008, Ian Kent wrote:

> On Fri, 7 Nov 2008, Michael Dressel wrote:
>
>> Hi,
>>
>> (I'm not subscribed to the list, please CC me.)
>>
>> in our software three processes are using several pthread mutexes.
>> Sometimes a process hangs inside pthread_mutex_lock even though the mutex is
>> not locked. I can tell it's not locked because another process is still
>> running and locking and unlocking the mutex.
>
> pthreads is implemented in glibc.
> If you really think there is a bug in the pthreads implementation then
> the glibc maintainers will require you to produce a simple example program
> which demonstrates the bug before it's accepted as a bug.

Yes.
I have not found any report that exactly matches my problem in the
mailing lists or bug reports. To be sure I had not overlooked something,
I posted my problem here. It looks like it is unique to me.

I failed to produce a simple example demonstrating the problem. In our
code we use timers and real-time signals, and we change the process
signal mask with sigprocmask. If there is a bug at all (I don't think
so), a program to demonstrate it would potentially have to do all of
these things and would therefore not be simple.

Following Bart Van Assche's suggestion, I used valgrind (the default
tool and helgrind), but I did not find anything obviously related to my
problem.

>
> When you say processes you mean threads, right?
>

No. We don't use threads. The mutexes are shared between processes. I used
them because they support recursive locking.
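
For readers unfamiliar with this setup, here is a minimal sketch of how a
recursive pthread mutex can be shared between processes. It is not the code
from our application. The helper names create_shared_recursive_mutex and
lock_unlock_nested are hypothetical, and the use of an anonymous shared
mapping inherited across fork() is an assumption; the mutex could equally
live in a named shared memory segment.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: place one mutex in memory that stays shared with
 * children created by fork(), and make it recursive and process-shared.
 * Returns NULL on failure. */
static pthread_mutex_t *create_shared_recursive_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return NULL;

    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(m, sizeof *m);
        return NULL;
    }
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) != 0 ||
        pthread_mutex_init(m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        munmap(m, sizeof *m);
        return NULL;
    }
    pthread_mutexattr_destroy(&attr);
    return m;
}

/* Hypothetical helper: lock twice and unlock twice from the same owner.
 * With a non-recursive mutex the second lock would fail or block.
 * Returns 0 on success, -1 on failure. */
static int lock_unlock_nested(pthread_mutex_t *m)
{
    assert(m != NULL);
    if (pthread_mutex_lock(m) != 0)
        return -1;
    if (pthread_mutex_lock(m) != 0) {
        pthread_mutex_unlock(m);
        return -1;
    }
    pthread_mutex_unlock(m);
    pthread_mutex_unlock(m);
    return 0;
}
```

A parent process would call create_shared_recursive_mutex() before fork(),
so that parent and children all operate on the same mutex object.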

> If you can't produce such an example program and you can prove (to
> yourself) there are no use after free or execution order issues with your
> code then your only option is to develop a workaround.
>

I found a workaround. We use normal semaphores now. This is possible
because we don't use multiple threads. In order to provide recursion I
had to implement a per-process counter. This would not work if the
semaphore were required during signal handler execution, but that does
not happen in our application.
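
The workaround can be sketched roughly as follows. This is an illustration
under assumptions, not our actual code: it assumes a POSIX unnamed semaphore
in an anonymous shared mapping (a System V semaphore would work along the
same lines), and struct rec_lock and the rec_lock_* functions are
hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper. The semaphore lives in shared memory; the depth
 * counter is ordinary process-private memory, so after fork() every
 * process has its own copy. */
struct rec_lock {
    sem_t *sem;  /* shared between processes */
    int depth;   /* how often THIS process currently holds the lock */
};

/* Call before fork(), while depth is 0. Returns 0 on success. */
static int rec_lock_init(struct rec_lock *l)
{
    l->sem = mmap(NULL, sizeof *l->sem, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (l->sem == MAP_FAILED)
        return -1;
    if (sem_init(l->sem, 1 /* shared between processes */, 1) != 0) {
        munmap(l->sem, sizeof *l->sem);
        return -1;
    }
    l->depth = 0;
    return 0;
}

static int rec_lock_acquire(struct rec_lock *l)
{
    if (l->depth == 0) {
        /* Only the outermost acquire touches the semaphore. */
        while (sem_wait(l->sem) != 0) {
            if (errno != EINTR)
                return -1;
        }
    }
    l->depth++;
    return 0;
}

static int rec_lock_release(struct rec_lock *l)
{
    assert(l->depth > 0);
    if (--l->depth == 0)
        return sem_post(l->sem);  /* outermost release frees the lock */
    return 0;
}
```

The counter is only safe with a single thread per process. It also shows why
the scheme breaks in signal handlers: a handler that runs after sem_wait()
has returned but before depth is incremented sees depth == 0, calls
sem_wait() again, and blocks forever.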

> Your code wouldn't happen to be doing thread synchronization with
> pthread_cond_wait()/pthread_cond_signal() would it?

No, since we don't use multiple threads.

The reason I wondered whether this is a kernel issue (maybe a
configuration issue) is that sending a STOP/CONT signal sequence to the
hanging process got it going again. So at least it is not a classical
deadlock.
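
The same STOP/CONT sequence can be sent from a program instead of from the
shell. The helper name nudge_process below is hypothetical, and this is only
a sketch of the experiment. It does not explain why the process resumes.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: stop and immediately resume the process `pid`.
 * Returns 0 on success, -1 if either kill() call fails (errno is set). */
static int nudge_process(pid_t pid)
{
    /* pid <= 0 would address process groups or all processes; refuse. */
    assert(pid > 0);

    if (kill(pid, SIGSTOP) != 0)
        return -1;
    return kill(pid, SIGCONT);
}
```

Calling nudge_process() with the PID of the hanging process has the same
effect as running kill -STOP followed by kill -CONT on it.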

Cheers,
 	Michael
