linux-kernel - Re: TASK_DEAD && ttwu() again (Was: ensure arch_ptrace/ptrace

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20130123195059.GH2373@mtj.dyndns.org>
Date:	Wed, 23 Jan 2013 11:50:59 -0800
From:	Tejun Heo <tj@...nel.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Yasunori Goto <y-goto@...fujitsu.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	Dan Carpenter <dan.carpenter@...cle.com>,
	Michael Davidson <md@...gle.com>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Julien Tinnes <jln@...gle.com>,
	Aaron Durbin <adurbin@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Roland McGrath <roland@...k.frob.com>,
	Alexander Gordeev <agordeev@...hat.com>
Subject: Re: TASK_DEAD && ttwu() again (Was: ensure
 arch_ptrace/ptrace_request can never race with SIGKILL)

Hello, Oleg.

On Wed, Jan 23, 2013 at 08:19:46PM +0100, Oleg Nesterov wrote:
> Let me remind the problem. To oversimplify, we have
> 
> 	try_to_wake_up(task, state)
> 	{
> 		lock(task->pi_lock);
> 
> 		if (task->state & state)
> 			task->state = TASK_RUNNING;
> 
> 		unlock(task->pi_lock);
> 	}
> 
> And this means that a task doing
> 
> 	current->state = STATE_1;
> 	// no schedule() in between
> 	current->state = STATE_2;
> 	schedule();
> 
> can be actually woken up by try_to_wake_up(STATE_1) even if it already
> sleeps in STATE_2.

Hmmm... nasty.

...
> and we have the same problem again. So _I think_ that we we need another
> mb() after unlock_wait() ?

Seems so, or, maybe we should add barrier semantics to unlock_wait()?
As it currently stands, it kinda invites misusages.

> And, afaics, in theory we can't simply move the current mb() down, after
> unlock_wait().  (again, only in theory, if nothing else we should have
> the implicit barrrers after we played with ->state in the past).
> 
> Or perhaps we should modify ttwu_do_wakeup() to not blindly set RUNNING,
> say, cmpxchg(old_state, RUNNING). But this is not simple/nice.

I personally think this is the right thing to do short of requiring
locking on current->state changes.  The situation is a bit muddy
because we're generally requiring sleepers to loop while still having
cases where things don't work that way.  It's a little scary that we
require looping to protect against stray wakeups, which can be very
rare, without any way to verify/test.

The waker would be acquiring the cacheline exclusively one way or the
other, so I don't think doing cmpxchg would add much overhead.  We
would definitely want to do comparisons tho.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/