Message-ID: <alpine.DEB.2.21.1909231545560.2003@nanos.tec.linutronix.de>
Date: Mon, 23 Sep 2019 16:09:00 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Peter Zijlstra <peterz@...radead.org>
cc: Yunfeng Cui <cui.yunfeng@....com.cn>, christian@...uner.io,
keescook@...omium.org, luto@...capital.net, wad@...omium.org,
akpm@...ux-foundation.org, mingo@...nel.org, mhocko@...e.com,
elena.reshetova@...el.com, aarcange@...hat.com, ldv@...linux.org,
arunks@...eaurora.org, guro@...com, joel@...lfernandes.org,
viro@...iv.linux.org.uk, linux-kernel@...r.kernel.org,
xue.zhihong@....com.cn, wang.yi59@....com.cn,
jiang.xuexin@....com.cn
Subject: Re: [PATCH] futex: robust futex may never be woken up, in rare situations.
On Mon, 23 Sep 2019, Peter Zijlstra wrote:
> On Mon, Sep 23, 2019 at 11:18:20AM +0800, Yunfeng Cui wrote:
> > I used a model checker to find an issue with robust and PI futexes. In the
> > situation below, the owner can't find the corresponding entry in its
> > pi_state_list, while the requester blocks and is never woken up.
> >
> > CPU0                                CPU1
> > futex_lock_pi
> > /*some cs code*/
> >                                     futex_lock_pi
> >                                       futex_lock_pi_atomic
> >                                       ...
> >                                       newval = uval | FUTEX_WAITERS;
> >                                       ret = lock_pi_update_atomic(uaddr, uval, newval);
> >                                       ...
> >                                       attach_to_pi_owner
> >                                         ....
> >                                         p = find_get_task_by_vpid(pid);
> >                                         if (!p)
> >                                           return handle_exit_race(uaddr, uval, NULL);
> >                                         ....
> >                                         raw_spin_lock_irq(&p->pi_lock);
> >                                         ....
> >                                         pi_state = alloc_pi_state();
> >                                         ....
> > do_exit->mm_release
> >   if (unlikely(tsk->robust_list)) {
> >     exit_robust_list(tsk);
> >     tsk->robust_list = NULL;
> >   }
> >   if (unlikely(!list_empty(&tsk->pi_state_list)))
> >     exit_pi_state_list(tsk); /*WILL MISS*/
> >                                         list_add(&pi_state->list, &p->pi_state_list);
> >                                         WILL BLOCK, NEVER BE WOKEN UP!
>
> Did you forget/overlook the pi_lock fiddling in do_exit() ? I'm thinking
> that would make the above impossible.
Right. I was trying to construct a case which allows the above, but failed
to do so.
Let's look at the exiting task:

exit()
  exit_signals()
    tsk->flags |= PF_EXITING;
  smp_mb();
  raw_spin_lock_irq(&tsk->pi_lock);
  (1)
  raw_spin_unlock_irq(&tsk->pi_lock);
  exit_mm()
    mm_release()
      exit_robust_list()
      if (!list_empty(&tsk->pi_state_list))
        exit_pi_state_list(tsk);
And now at the attaching task:

attach_to_pi_owner()
  raw_spin_lock_irq(&tsk->pi_lock);
  if (tsk->flags & PF_EXITING)
    return;
  pi_state = alloc_pi_state()
  list_add(pi_state, tsk->pi_state_list);
See (1) above. That's the crucial point.
Once the exiting task has set PF_EXITING and acquired tsk->pi_lock, it is
impossible for the attaching task to queue itself as it _must_ observe
PF_EXITING after it acquired tsk->pi_lock.
If it manages to acquire tsk->pi_lock _before_ the exiting task does that,
then it either observes PF_EXITING or not.

If it does, it bails out. If it does not, it queues itself on
tsk->pi_state_list and will be cleaned up by the exiting task.
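To make that concrete, here is a small userspace model of the handshake.
This is purely illustrative, not kernel code: a pthread mutex stands in
for tsk->pi_lock, an atomic int for tsk->flags and a plain counter for
tsk->pi_state_list; the kernel names are reused only for readability.

/*
 * Userspace model of the PF_EXITING / pi_lock handshake.
 * NOT kernel code. Build with: cc -pthread model.c
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define PF_EXITING 0x00000004

static pthread_mutex_t pi_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int flags;
static int queued;			/* stands in for tsk->pi_state_list */

/* Attaching task: only queue when the owner is not exiting. */
static void *attacher(void *arg)
{
	pthread_mutex_lock(&pi_lock);
	if (atomic_load(&flags) & PF_EXITING) {
		/* Cases 2 and 3: bail out, nothing gets queued. */
		pthread_mutex_unlock(&pi_lock);
		return NULL;
	}
	/* Case 1: queue; the exiting task will clean this up. */
	queued++;
	pthread_mutex_unlock(&pi_lock);
	return NULL;
}

/* Exiting task: set the flag, then synchronize via the lock. */
static void *exiter(void *arg)
{
	/* Moral equivalent of: tsk->flags |= PF_EXITING; smp_mb(); */
	atomic_fetch_or(&flags, PF_EXITING);

	pthread_mutex_lock(&pi_lock);	/* (1) */
	pthread_mutex_unlock(&pi_lock);

	/* Anything queued before (1) must be visible now. */
	pthread_mutex_lock(&pi_lock);
	if (queued) {
		printf("exiter: cleaning up %d queued pi_state(s)\n", queued);
		queued = 0;
	}
	pthread_mutex_unlock(&pi_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, e;

	pthread_create(&a, NULL, attacher, NULL);
	pthread_create(&e, NULL, exiter, NULL);
	pthread_join(a, NULL);
	pthread_join(e, NULL);

	/* In no interleaving can a pi_state remain queued here. */
	printf("leaked pi_states: %d\n", queued);
	return 0;
}

However the two threads interleave, the final count is zero: either the
attacher observes PF_EXITING and bails out, or it queued before (1) and
the exiting side's cleanup pass finds it.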

Simplified concurrency picture:

Case 1: Attacher does not see PF_EXITING

CPU 0                                 CPU 1

                                      lock(&tsk->pi_lock);
tsk->flags |= PF_EXITING;             if (!(tsk->flags & PF_EXITING))
                                        queue(pi_state, tsk);
smp_mb();                             unlock(&tsk->pi_lock);
lock(&tsk->pi_lock);
(1)
unlock(&tsk->pi_lock);
if (!list_empty(&tsk->pi_state_list))
  exit_pi_state_list(tsk);
Case 2: Attacher does see PF_EXITING before (1)

CPU 0                                 CPU 1

                                      lock(&tsk->pi_lock);
tsk->flags |= PF_EXITING;             if (tsk->flags & PF_EXITING) {
                                        unlock(&tsk->pi_lock);
smp_mb();                               return;
lock(&tsk->pi_lock);                  }
(1)
unlock(&tsk->pi_lock);

The attacher CANNOT be queued in tsk->pi_state_list.
Case 3: Attacher does see PF_EXITING after (1)

CPU 0                                 CPU 1

tsk->flags |= PF_EXITING;
smp_mb();
lock(&tsk->pi_lock);                  lock(&tsk->pi_lock);
(1)
unlock(&tsk->pi_lock);
                                      if (tsk->flags & PF_EXITING) {
                                        unlock(&tsk->pi_lock);
                                        return;
                                      }
There are no other cases possible. If the attacher can observe
!(tsk->flags & PF_EXITING)
_after_ (1) then there is something seriously wrong, but not in the futex
code. That would be a massive memory ordering issue.
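FWIW, that last claim can be written down as a litmus test. This is a
sketch assuming herd7 with the kernel memory model (tools/memory-model);
the auxiliary variable mark only records that the attacher's critical
section came after (1):

C futex-exit-pf-exiting

{}

P0(int *flags, int *mark, spinlock_t *pi_lock)
{
	WRITE_ONCE(*flags, 1);		/* tsk->flags |= PF_EXITING */
	smp_mb();
	spin_lock(pi_lock);		/* (1) */
	WRITE_ONCE(*mark, 1);
	spin_unlock(pi_lock);
}

P1(int *flags, int *mark, spinlock_t *pi_lock)
{
	int r0;
	int r1;

	spin_lock(pi_lock);
	r0 = READ_ONCE(*mark);		/* 1 means: after (1) */
	r1 = READ_ONCE(*flags);
	spin_unlock(pi_lock);
}

exists (1:r0=1 /\ 1:r1=0)

The exists clause asks for an attacher which runs after (1) (r0 == 1)
yet does not observe PF_EXITING (r1 == 0); the model should report that
outcome as never possible.
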
Thanks,
tglx