linux-kernel - Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20161125140947.GB3107@twins.programming.kicks-ass.net>
Date:   Fri, 25 Nov 2016 15:09:47 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     mingo@...nel.org, juri.lelli@....com, rostedt@...dmis.org,
        xlpang@...hat.com, bigeasy@...utronix.de,
        linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
        jdesfossez@...icios.com, bristot@...hat.com
Subject: Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK_PI

On Thu, Oct 27, 2016 at 10:36:00PM +0200, Thomas Gleixner wrote:
> So that waiter which is now spinning on pi_mutex->lock has already set the
> waiters bit, which you undo. So you created the following scenario:
> 
> CPU0	     	       	  CPU1				CPU2
> 
> TID 0x1001		  TID 0x1000			TID 0x1002
> 
> 			  Acquires futex in user space
>     			  futex value = 0x1000;
> 
> futex_lock_pi()
>   
>   lock_hb()
>   set_waiter_bit()
>   --> futex value = 0x40001000;
> 
> 			  futex_unlock_pi()
> 			   lock_hb()
>   attach_to_pi_owner()
>     rt_mutex_init_proxy_locked()
>   queue_me()
>     unlock_hb()
> 			   unlock_hb()
>   rt_mutex_lock()	   wake_futex_pi()
>    			   lock(pi_mutex->lock);
>    lock(pi_mutex->lock)	   new_owner is NULL;
> 		 	    --> futex value = 0;
> 			   rt_mutex_futex_unlock(pi_mutex);
> 			   unlock(pi_mutex->lock);
>    acquire_rt_mutex()	   return to user space
> 							Acquires futex in user space
> 							--> futex value = 0x1002
>   fixup_owner()
>     fixup_pi_state_owner()
>        uval = 0x1002;
>        newval = 0x40001001;
>        cmpxchg(uval, newval) succeeds
>        --> futex value = 0x40001001
> 
> Voila. Inconsistent state .... TID 0x1002 is borked now.

OK, I think its much worse...


Consider:


CPU0 (tid=0x1000)		CPU1 (tid=0x1001)		CPU2 (tid=0x1002)


acquire in userspace
  *uaddr = 0x1000

				futex_lock_pi()
				  acq hb->lock
				  attach_to_pi_owner
				    pi_state->refcount == 1
				  queue_me
				    rel hb->lock
								futex_lock_pi()
								  acq hb->lock
								  attach_to_pi_state
								    pi_state->refcount == 2
								  queue_me
								    rel hb->lock

futex_unlock_pi()
  acq pi_mutex->lock
  top_waiter == (CPU1 | CPU2)
  wake_futex_pi()
    new_owner == NULL
      pi_state->owner = &init_task
      *uaddr = 0
    rt_mutex_futex_unlock()
  ret-to-user


acquire in userspace
  *uaddr = 0x1000



>From here there's multiple ways to trainwreck the thing, the way you
list above, lets call this A):

								  rt_mutex_lock()
								  acq hb->lock
								  fixup_owner()
								    cmpxchg(uaddr, 0x1000, 0x40001002)

								    *BORKED CPU0*

alternatively we can do; B):

			CPU3 (or CPU0 with a different tid)

			futex_lock_pi()
			  acq hb->lock
			  attach_to_pi_state()
			    -EINVAL : *uaddr != task_pid_vnr(&init_task)



Now, since CPU1 is pinning the hb->waiter relation, the proposed
fixup_owner() -EAGAIN change cannot work either, since, A1):

								  rt_mutex_lock()
								  acq hb->lock
								  fixup_owner() : -EAGAIN
								  unqueue_me_pi()
								    put_pi_state
								      pi_state->refcount == 1
								    rel hb->lock
								  goto retry_private

								retry_private:
								  acq hb->lock
								  attach_to_pi_state() : -EINVAL


Similar things happen with CMP_REQUEUE doing a lookup_pi_state() around
this point.

I'm sorely tempted to do the 'simple' thing and leak FUTEX_WAITERS,
forcing things into the slow path.