Date:   Fri, 25 Nov 2016 11:03:00 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     mingo@...nel.org, juri.lelli@....com, rostedt@...dmis.org,
        xlpang@...hat.com, bigeasy@...utronix.de,
        linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
        jdesfossez@...icios.com, bristot@...hat.com
Subject: Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK_PI

On Fri, Nov 25, 2016 at 10:23:26AM +0100, Peter Zijlstra wrote:
> On Thu, Nov 24, 2016 at 07:58:07PM +0100, Peter Zijlstra wrote:
> 
> > OK, so clearly I'm confused. So let me try again.
> > 
> > LOCK_PI does, in one function, both lookup_pi_state() and
> > fixup_owner(). If fixup_owner() fails with -EAGAIN, we can redo the
> > pi_state lookup.
> > 
> > The requeue ops, OTOH, have one each: WAIT_REQUEUE has fixup_owner(),
> > CMP_REQUEUE has lookup_pi_state(). Therefore, fixup_owner() failing
> > with -EAGAIN leaves us dead in the water; there is nothing to go back
> > to and retry.
> > 
> > So far, so 'good', right?
> > 
> > Now, as far as I understand this requeue stuff, we have 2 futexes, an
> > inner futex and an outer futex. The inner futex is always 'locked' and
> > serves as a collection pool for waiting threads.
> > 
> > The requeue crap picks one (or more) waiters from the inner futex and
> > sticks them on the outer futex, which gives them a chance to run.
> > 
> > So WAIT_REQUEUE blocks on the inner futex, but knows that if it ever
> > gets woken, it will be on the outer futex, and hence needs to
> > fixup_owner if the futex and rt_mutex state got out of sync.
> > 
> > CMP_REQUEUE picks the one (or more) waiters off the inner futex and
> > sticks them on the outer futex.
> > 
> > So far, so 'good' ?
> > 
> > The thing I'm not entirely sure on is what happens with the outer futex:
> > do we first LOCK_PI it before doing CMP_REQUEUE, giving us waiters, and
> > then UNLOCK_PI to let them rip? Or do we just CMP_REQUEUE and then let
> > whoever wins finish with UNLOCK_PI?
> > 
> > 
> > In any case, I don't think it matters much, either way we can race
> > between the 'last' UNLOCK_PI and getting rt_mutex waiters and then hit
> > the &init_task funny state, such that WAIT_REQUEUE waking hits EAGAIN
> > and we're 'stuck'.
> > 
> > Now, if we always CMP_REQUEUE to a locked outer futex, then we cannot
> > know, at CMP_REQUEUE time, who will win and cannot fix up.
> 
> OTOH, if we always first LOCK_PI before doing CMP_REQUEUE, I don't think
> we can hit the funny state, LOCK_PI will have fixed it up for us.
> 
> So the question is, do we mandate LOCK_PI before CMP_REQUEUE?

Going by futex_requeue(), the first thing it does, after validation and
taking the hb locks, is futex_proxy_trylock_atomic(), which per the
comment above it will attempt to acquire uaddr2.

So there is no such mandate; otherwise that op would not exist and we'd
only need to validate that uaddr2 was 'current'.

Ah well, back to reading more...
