lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Thu, 24 Jan 2013 19:50:26 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Yasunori Goto <y-goto@...fujitsu.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>,
	Dan Carpenter <dan.carpenter@...cle.com>,
	Michael Davidson <md@...gle.com>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Julien Tinnes <jln@...gle.com>,
	Aaron Durbin <adurbin@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Roland McGrath <roland@...k.frob.com>,
	Alexander Gordeev <agordeev@...hat.com>
Subject: Re: TASK_DEAD && ttwu() again (Was: ensure
	arch_ptrace/ptrace_request can never race with SIGKILL)

Hi Tejun,

On 01/23, Tejun Heo wrote:
> Hello, Oleg.
>
> On Wed, Jan 23, 2013 at 08:19:46PM +0100, Oleg Nesterov wrote:
> >
> > And this means that a task doing
> >
> > 	current->state = STATE_1;
> > 	// no schedule() in between
> > 	current->state = STATE_2;
> > 	schedule();
> >
> > can be actually woken up by try_to_wake_up(STATE_1) even if it already
> > sleeps in STATE_2.
>
> Hmmm... nasty.
>
> ...
> > and we have the same problem again. So _I think_ that we we need another
> > mb() after unlock_wait() ?
>
> Seems so, or, maybe we should add barrier semantics to unlock_wait()?
> As it currently stands, it kinda invites misusages.

Perhaps, I am not sure. I agree, unlock_wait() without a barrier at
least on one side probably never makes sense.

> > And, afaics, in theory we can't simply move the current mb() down, after
> > unlock_wait().  (again, only in theory, if nothing else we should have
> > the implicit barrrers after we played with ->state in the past).
> >
> > Or perhaps we should modify ttwu_do_wakeup() to not blindly set RUNNING,
> > say, cmpxchg(old_state, RUNNING). But this is not simple/nice.
>
> I personally think this is the right thing to do short of requiring
> locking on current->state changes.  The situation is a bit muddy
> because we're generally requiring sleepers to loop while still having
> cases where things don't work that way.  It's a little scary that we
> require looping to protect against stray wakeups, which can be very
> rare, without any way to verify/test.
>
> The waker would be acquiring the cacheline exclusively one way or the
> other, so I don't think doing cmpxchg would add much overhead.  We
> would definitely want to do comparisons tho.

Still cmpxchg() is not free, Peter argued that ttwu() is the very hot
path and I understand his concern. And in fact this change doesn't look
very simple...

And this can not eliminate the spurious wakeups in general. Plus in
this particular case (I mean the race described by the comment above
spin_unlock_wait in do_exit) we could equally blame rw_semaphore, not
try_to_wake_up().

So I leave this to scheduler people.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ