[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <537650BD.8050105@redhat.com>
Date: Fri, 16 May 2014 13:54:05 -0400
From: "Carlos O'Donell" <carlos@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: Peter Zijlstra <peterz@...radead.org>,
Darren Hart <dvhart@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Dave Jones <davej@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Darren Hart <darren@...art.com>,
Davidlohr Bueso <davidlohr@...com>,
Ingo Molnar <mingo@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Clark Williams <williams@...hat.com>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Roland McGrath <roland@...k.frob.com>,
Jakub Jelinek <jakub@...hat.com>,
Michael Kerrisk <mtk.manpages@...il.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [patch 0/3] futex/rtmutex: Fix issues exposed by trinity
On 05/14/2014 07:11 PM, Thomas Gleixner wrote:
> On Wed, 14 May 2014, Carlos O'Donell wrote:
>> On 05/14/2014 05:22 AM, Peter Zijlstra wrote:
>>>>> I believe the thinking goes that if we get to here, then the lock is in an
>>>>> inconsistent state (between kernel and userspace). I don't have an answer for
>>>>> why pausing forever would be preferable to returning an error however...
>>>>
>>>> What error would we return?
>>>
>>> EDEADLK is a valid user return for pthread_mutex_lock() as per:
>>>
>>> http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html
>>
>> How is that correct? It isn't a deadlock we've detected but inconsistent
>> state between glibc and the kernel. In this case glibc should assert.
>> Delaying indefinitely with pause() never seems correct (despite that being
>> what we do today).
>
> If there is inconsistent state detected then the kernel will return
> -EPERM or -EINVAL. So lets put inconsistent state aside.
OK.
> In glibc you only can detect the simple AA dead lock, i.e lock owner
> tries to lock the lock it owns again. Trivial, right ?
Agreed.
> But glibc has no idea which lock chains are involved and might lead to
> a dead lock caused by nested locking, simplest and most popular being
> ABBA.
OK.
> The kernel can (if the implementation is fixed, patch is available
> already) very well detect ABBA and even more complex nested lock
> deadlocks. So it rightfully returns -EDEADLK and that is completely
> correct versus the spec and the call site can do something about it.
OK.
> And that's not different from the glibc detected AA deadlock at
> all. It's just detected by a different mechanism.
OK.
> On kernel side we currently provide this service only for the PI
> futexes because we have a kernel side state representation as long as
> the user space state is not corrupted.
OK.
> Back then when it was implemented the dead lock detection actually
> worked and was agreed on by both sides - kernel and glibc - to be
> usefull and essential to the whole endavour.
I agree that ignoring the situation of corrupted or inconsistent
state we should be returning EDEADLK to userspace.
We'll cleanup glibc.
Cheers,
Carlos.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists