lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 8 Oct 2016 17:53:49 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Peter Zijlstra <peterz@...radead.org>
cc:     mingo@...nel.org, juri.lelli@....com, rostedt@...dmis.org,
        xlpang@...hat.com, bigeasy@...utronix.de,
        linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
        jdesfossez@...icios.com, bristot@...hat.com
Subject: Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK_PI

On Fri, 7 Oct 2016, Peter Zijlstra wrote:
> Solve all that by:
> 
>  - using futex specific rt_mutex calls that lack the fastpath, futexes
>    have their own fastpath anyway. This makes that
>    rt_mutex_futex_unlock() doesn't need to drop rt_mutex::wait_lock
>    and the unlock is guaranteed if we manage to update user state.
> 
>  - make futex_unlock_pi() drop hb->lock early and only use
>    rt_mutex::wait_lock to serialize against rt_mutex waiters
>    update the futex value and unlock.
> 
>  - in case futex and rt_mutex disagree on waiters, side with rt_mutex
>    and simply clear the user value. This works because either there
>    really are no waiters left, or futex_lock_pi() triggers the
>    lock-steal path and fixes up the WAITERS flag.

I stared at this for a few hours and while I'm not yet done analyzing all
possible combinations I found at least one thing which is broken:

CPU 0				CPU 1

unlock_pi(f)
  ....
  unlock(hb->lock)
  *f = new_owner_tid | WAITERS;

				lock_pi(f) 
				  lock(hb->lock)
				  uval = *f;
				  topwaiter = futex_top_waiter();
				    attach_to_pi_state(uval, topwaiter->pistate);
				      pid = uval & TID_MASK;
				      if (pid != task_pid_vnr(pistate->owner))
				      	 return -EINVAL;
  ....
  pistate->owner = newowner;

So in this case we tell the caller on CPU 1 that the futex is in
inconsistent state, because pistate->owner still points to the unlocking
task while the user space value alread shows the new owner. So this sanity
check triggers and we simply fail while we should not. It's [10] in the
state matrix above attach_to_pi_state().

I suspect that there are more issues like this, especially since I did not
look at requeue_pi yet, but by now my brain is completely fried.

Thanks,

	tglx


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ