[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170405215858.GC13494@fury>
Date: Wed, 5 Apr 2017 14:58:58 -0700
From: Darren Hart <dvhart@...radead.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: tglx@...utronix.de, mingo@...nel.org, juri.lelli@....com,
rostedt@...dmis.org, xlpang@...hat.com, bigeasy@...utronix.de,
linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
jdesfossez@...icios.com, bristot@...hat.com
Subject: Re: [PATCH -v6 07/13] futex: Rework inconsistent rt_mutex/futex_q
state
On Wed, Mar 22, 2017 at 11:35:54AM +0100, Peter Zijlstra wrote:
> There is a weird state in the futex_unlock_pi() path when it
> interleaves with a concurrent futex_lock_pi() at the point where it
> drops hb->lock.
>
> In this case, it can happen that the rt_mutex wait_list and the
> futex_q disagree on pending waiters, in particular rt_mutex will find
> no pending waiters where futex_q thinks there are.
>
> In this case the rt_mutex unlock code cannot assign an owner.
>
> What the current code does in this case is use the futex_q waiter that
> got us here; however when the rt_mutex_timed_futex_lock() has already
> failed; this leaves things in a weird state, resulting in much
> head-aches in fixup_owner().
>
> Simplify all this by changing wake_futex_pi() to return -EAGAIN when
> this situation occurs. This then gives the futex_lock_pi() code the
> opportunity to continue and the retried futex_unlock_pi() will now
> observe a coherent state.
>
> The only problem is that this breaks RT timeliness guarantees. That
> is, consider the following scenario:
>
> T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
>
> CPU0
>
> T1
> lock_pi()
> queue_me() <- Waiter is visible
>
> preemption
>
> T2
> unlock_pi()
> loops with -EAGAIN forever
>
> Which is undesirable for PI primitives. Future patches will rectify
> this. For now we want to get rid of the fixup magic.
Errrrm... OK... I don't like the idea of having this broken after this commit,
but until I internalize the remaining 5 (that number has never seemed quite so
dauntingly large before... 5...) I can't comment on the alternative. I suppose
having it documented in the commit log means anyone backporting only up to this
point gets what they deserve.
A good patch *removing* code from futex.c is always nice though !
--
Darren Hart
VMware Open Source Technology Center
Powered by blists - more mailing lists