Message-ID: <alpine.DEB.2.21.1911051959260.1869@nanos.tec.linutronix.de>
Date:   Tue, 5 Nov 2019 20:19:15 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Oleg Nesterov <oleg@...hat.com>
cc:     Florian Weimer <fweimer@...hat.com>, Shawn Landden <shawn@....icu>,
        libc-alpha@...rceware.org, linux-api@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>,
        Arnd Bergmann <arnd@...db.de>,
        Deepa Dinamani <deepa.kernel@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Keith Packard <keithp@...thp.com>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: handle_exit_race && PF_EXITING

On Tue, 5 Nov 2019, Thomas Gleixner wrote:
> 
> I'm a moron. It's vfork() not fork() so the behaviour is expected.
> 
> Staring more at the trace which shows me where this goes down the drain.

 parent                                 child

  set FIFO prio 2

  vfork()                       ->      set FIFO prio 1
   implies wait_for_child()             sched_setscheduler(...)
                                        exit()
                                        do_exit()
                                        tsk->flags |= PF_EXITING;
                                        ....
                                        mm_release()
                                          exit_futex(); (NOOP in this case)
                                          complete() --> wakes parent
  sys_futex()
    loops forever because
      PF_EXITING is set,
      but PF_EXITPIDONE is not
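
To make the retry condition in the trace concrete, here is a minimal userspace model of it. This is a sketch only: the flag values, model_must_retry() and model_count_retries() are hypothetical names, not the kernel's code, and the retry cap exists solely so the model terminates.

```c
#include <assert.h>

/* Illustrative flag values, not the kernel's. */
#define PF_EXITING     0x1u
#define PF_EXITPIDONE  0x2u

/*
 * Model of the waiter's view of the owner: it has to retry while the
 * owner has PF_EXITING set but has not yet set PF_EXITPIDONE.
 */
static int model_must_retry(unsigned int owner_flags)
{
	return (owner_flags & PF_EXITING) && !(owner_flags & PF_EXITPIDONE);
}

/*
 * Count how often the waiter retries. In the trace the owner never gets
 * to run, so owner_flags never changes between iterations. max_tries is
 * an artificial cap for the model.
 */
static int model_count_retries(unsigned int owner_flags, int max_tries)
{
	int tries = 0;

	while (tries < max_tries && model_must_retry(owner_flags))
		tries++;
	return tries;
}
```

With owner_flags == PF_EXITING the model exhausts whatever cap it is given. A FIFO waiter of higher priority than the exiting owner is in that situation for real, because the owner never runs to set PF_EXITPIDONE.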

So the obvious question is why PF_EXITPIDONE is set way after the futex
exit cleanup has run, but moving this right after exit_futex() would not
solve the exit race completely because the code after setting PF_EXITING is
preemptible. So the same crap could happen just by preemption:

  task holds futex
  ...
  do_exit()
    tsk->flags |= PF_EXITING;

preemption (unrelated wakeup of some other higher prio task, e.g. timer)

  switch_to(other_task)

  return to user
  sys_futex()
    loops forever as above

And just for the fun of it the futex exit cleanup could trigger the wakeup
itself before PF_EXITPIDONE is set.

There is some other issue which I need to look up again. That's a slightly
different problem but related to futex exit race conditions.

The way we can deal with that is:

    do_exit()
    tsk->flags |= PF_EXITING;
    ...
    mutex_lock(&tsk->futex_exit_mutex);
    futex_exit();
    tsk->flags |= PF_EXITPIDONE;
    mutex_unlock(&tsk->futex_exit_mutex);
    
and on the futex lock_pi side:

    if (!(tsk->flags & PF_EXITING))
    	return 0;		<- All good

    if (tsk->flags & PF_EXITPIDONE)
        return -EOWNERDEAD;	<- Locker can take over

    mutex_lock(&tsk->futex_exit_mutex);
    if (tsk->flags & PF_EXITPIDONE) {
        mutex_unlock(&tsk->futex_exit_mutex);
        return -EOWNERDEAD;	<- Locker can take over
    }

    queue_futex();
    mutex_unlock(&tsk->futex_exit_mutex);

Not that I think it's pretty, but it plugs all holes AFAICT.
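
For illustration only, the ordering above can be modelled in userspace with a pthread mutex and C11 atomics. Everything named here is hypothetical: struct task_model, the model_* helpers, the flag values, and the queued_waiters counter that stands in for queue_futex(). Returning 0 after queueing is also an assumption of the model. It is a sketch of the protocol, not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Illustrative flag values, not the kernel's. */
#define PF_EXITING     0x1u
#define PF_EXITPIDONE  0x2u

struct task_model {
	atomic_uint flags;
	int queued_waiters;		/* stand-in for queue_futex() */
	pthread_mutex_t futex_exit_mutex;
};

static void model_task_init(struct task_model *tsk)
{
	atomic_init(&tsk->flags, 0u);
	tsk->queued_waiters = 0;
	pthread_mutex_init(&tsk->futex_exit_mutex, NULL);
}

/* Exit side, step 1: mark the task as exiting. */
static void model_set_exiting(struct task_model *tsk)
{
	atomic_fetch_or(&tsk->flags, PF_EXITING);
}

/* Exit side, step 2: cleanup and PF_EXITPIDONE under the mutex. */
static void model_futex_exit(struct task_model *tsk)
{
	pthread_mutex_lock(&tsk->futex_exit_mutex);
	/* the futex exit cleanup would run here */
	atomic_fetch_or(&tsk->flags, PF_EXITPIDONE);
	pthread_mutex_unlock(&tsk->futex_exit_mutex);
}

/* lock_pi side: decide what to do with the current owner. */
static int model_lock_pi_check(struct task_model *tsk)
{
	if (!(atomic_load(&tsk->flags) & PF_EXITING))
		return 0;			/* All good */

	if (atomic_load(&tsk->flags) & PF_EXITPIDONE)
		return -EOWNERDEAD;		/* Locker can take over */

	/* Owner is between PF_EXITING and PF_EXITPIDONE: serialize. */
	pthread_mutex_lock(&tsk->futex_exit_mutex);
	if (atomic_load(&tsk->flags) & PF_EXITPIDONE) {
		pthread_mutex_unlock(&tsk->futex_exit_mutex);
		return -EOWNERDEAD;		/* Locker can take over */
	}
	tsk->queued_waiters++;			/* queue_futex() in the scheme */
	pthread_mutex_unlock(&tsk->futex_exit_mutex);
	return 0;
}
```

In this model a locker that arrives in the window either sleeps on the mutex until the exit side is done and then sees PF_EXITPIDONE, or it gets the mutex first and is queued before the cleanup runs. Neither path spins.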

Thanks,

	tglx
