Message-ID: <20170405215858.GC13494@fury>
Date:   Wed, 5 Apr 2017 14:58:58 -0700
From:   Darren Hart <dvhart@...radead.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     tglx@...utronix.de, mingo@...nel.org, juri.lelli@....com,
        rostedt@...dmis.org, xlpang@...hat.com, bigeasy@...utronix.de,
        linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
        jdesfossez@...icios.com, bristot@...hat.com
Subject: Re: [PATCH -v6 07/13] futex: Rework inconsistent rt_mutex/futex_q state

On Wed, Mar 22, 2017 at 11:35:54AM +0100, Peter Zijlstra wrote:
> There is a weird state in the futex_unlock_pi() path when it
> interleaves with a concurrent futex_lock_pi() at the point where it
> drops hb->lock.
> 
> In this case, it can happen that the rt_mutex wait_list and the
> futex_q disagree on pending waiters, in particular rt_mutex will find
> no pending waiters where futex_q thinks there are.
> 
> In this case the rt_mutex unlock code cannot assign an owner.
> 
> What the current code does in this case is use the futex_q waiter that
> got us here; however when the rt_mutex_timed_futex_lock() has already
> failed; this leaves things in a weird state, resulting in much
> head-aches in fixup_owner().
> 
> Simplify all this by changing wake_futex_pi() to return -EAGAIN when
> this situation occurs. This then gives the futex_lock_pi() code the
> opportunity to continue and the retried futex_unlock_pi() will now
> observe a coherent state.
> 
> The only problem is that this breaks RT timeliness guarantees. That
> is, consider the following scenario:
> 
>   T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
> 
>     CPU0
> 
>     T1
>       lock_pi()
>       queue_me()  <- Waiter is visible
> 
>     preemption
> 
>     T2
>       unlock_pi()
>         loops with -EAGAIN forever
> 
> Which is undesirable for PI primitives. Future patches will rectify
> this. For now we want to get rid of the fixup magic.
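
For readers following along, the retry behaviour described above can be sketched as a small user-space toy model. This is not kernel code: the struct, the function names (wake_futex_pi_sketch, unlock_pi_sketch, locker_makes_progress) and the max_retries bound are all hypothetical and exist only for illustration. It assumes the only thing that resolves the disagreement is the locker getting CPU time to finish enqueueing on the rt_mutex.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model, NOT kernel code: two views of "is there a waiter?" */
struct pi_state_sketch {
    bool futex_q_has_waiter;  /* waiter is visible on the futex_q (queue_me() done) */
    bool rt_mutex_has_waiter; /* waiter is enqueued on the rt_mutex wait_list */
    bool unlocked;            /* set once the unlock has completed */
};

/* Models the reworked wake path: report -EAGAIN when the two views disagree. */
static int wake_futex_pi_sketch(struct pi_state_sketch *s)
{
    if (s->futex_q_has_waiter && !s->rt_mutex_has_waiter)
        return -EAGAIN; /* inconsistent state: let the locker continue first */
    s->unlocked = true;
    return 0;
}

/* Models the locker (T1) getting CPU time and enqueueing on the rt_mutex. */
static void locker_makes_progress(struct pi_state_sketch *s)
{
    assert(s->futex_q_has_waiter); /* only a queued waiter can progress */
    s->rt_mutex_has_waiter = true;
}

/*
 * Models the unlock retry loop. locker_can_run == false stands in for the
 * scenario above: the unlocker (T2) has higher priority and shares the CPU,
 * so T1 never runs. max_retries is an assumption of this sketch so that it
 * terminates; the scenario above describes an unbounded loop.
 * Returns the number of attempts on success, or -EAGAIN after giving up.
 */
static int unlock_pi_sketch(struct pi_state_sketch *s, bool locker_can_run,
                            int max_retries)
{
    for (int attempt = 1; attempt <= max_retries; attempt++) {
        if (wake_futex_pi_sketch(s) == 0)
            return attempt;
        if (locker_can_run)
            locker_makes_progress(s);
    }
    return -EAGAIN;
}
```

In this model, when the locker is allowed to run between retries the second attempt observes a coherent state and succeeds. When the locker cannot run, every attempt returns -EAGAIN, which is the timeliness problem the commit log calls out.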

Errrrm... OK... I don't like the idea of having this broken after this commit,
but until I internalize the remaining 5 (that number has never seemed quite so
dauntingly large before... 5...) I can't comment on the alternative. I suppose
having it documented in the commit log means anyone backporting only up to this
point gets what they deserve.

A good patch *removing* code from futex.c is always nice though!

-- 
Darren Hart
VMware Open Source Technology Center
