Message-ID: <20140514100705.GH30445@twins.programming.kicks-ass.net>
Date: Wed, 14 May 2014 12:07:05 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Carlos O'Donell <carlos@...hat.com>,
Darren Hart <dvhart@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>,
Dave Jones <davej@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Darren Hart <darren@...art.com>,
Davidlohr Bueso <davidlohr@...com>,
Ingo Molnar <mingo@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Clark Williams <williams@...hat.com>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Roland McGrath <roland@...k.frob.com>,
Jakub Jelinek <jakub@...hat.com>,
Michael Kerrisk <mtk.manpages@...il.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [patch 0/3] futex/rtmutex: Fix issues exposed by trinity

On Wed, May 14, 2014 at 11:53:44AM +0200, Thomas Gleixner wrote:
> > What error would we return?
> >
> > This particular case is a serious error for which we have no good error code
> > to return to userspace. It's an implementation defect, a bug, we should probably
> > assert instead of pausing.
>
> Errm.
>
> http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_lock.html
>
> The pthread_mutex_lock() function may fail if:
>
> [EDEADLK]
> The current thread already owns the mutex.
>
> That's exactly the error code which the kernel returns when it
> detects a deadlock.
>
> And glibc returns EDEADLK at a lot of places already. So in that case
> it's not a serious error? Because it's detected by glibc. You can't be
> serious about that.
>
> So why is a kernel detected deadlock different? Because it detects not
> only AA, it detects ABBA and more. But it's still a deadlock. And
> while the POSIX spec only talks about AA, it's the very same issue.
>
> So why not propagate this to the caller so he gets an alert right away
> instead of letting him attach a debugger, scratch his head, and
> look up the glibc source to find out why the hell glibc called pause.

http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html

  The pthread_mutex_lock() function may fail if:

  [EDEADLK]
      A deadlock condition was detected or the current thread already owns the mutex.

Which is explicitly wider than the AA recursion and fully supports the
full lock graph traversal we do.
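
To make the AA vs. ABBA distinction concrete, here is a minimal sketch of a
wait-for chain walk in plain C. It is a toy model, not the kernel's rtmutex
code: struct task, struct lock, would_deadlock(), check_scenarios() and
MAX_CHAIN_DEPTH are all hypothetical names invented for this illustration.
The only point is that one walk reports EDEADLK both for the AA case and for
longer cycles.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are NOT the kernel's structures. */
struct task;

struct lock {
	struct task *owner;		/* NULL when the lock is free */
};

struct task {
	struct lock *blocked_on;	/* lock this task waits for, or NULL */
};

/* Assumed bound on the walk so a corrupted graph cannot loop forever. */
#define MAX_CHAIN_DEPTH 1024

/*
 * Return EDEADLK if letting 'self' block on 'want' would close a cycle
 * in the wait-for graph (or the chain is longer than the bound),
 * 0 otherwise.
 */
int would_deadlock(const struct task *self, const struct lock *want)
{
	const struct lock *l = want;
	int depth = 0;

	while (l && l->owner) {
		if (l->owner == self)
			return EDEADLK;	/* the chain leads back to us */
		if (++depth > MAX_CHAIN_DEPTH)
			return EDEADLK;	/* give up, treat as deadlock */
		l = l->owner->blocked_on;	/* follow owner to its lock */
	}
	return 0;
}

/* Exercise the AA case, a harmless case and the ABBA case. */
void check_scenarios(void)
{
	struct task t1 = { NULL }, t2 = { NULL };
	struct lock a = { &t1 }, b = { &t2 };

	/* AA: t1 owns a and asks for it again. */
	assert(would_deadlock(&t1, &a) == EDEADLK);

	/* Plain contention: t1 wants b, its owner t2 is not blocked. */
	assert(would_deadlock(&t1, &b) == 0);

	/* ABBA: t2 is now blocked on a (owned by t1), t1 then wants b. */
	t2.blocked_on = &a;
	assert(would_deadlock(&t1, &b) == EDEADLK);
}
```

The AA case is just the walk terminating on its first step, so the "current
thread already owns the mutex" wording and the wider "deadlock condition was
detected" wording come out of the same check.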