Message-ID: <20120129160711.GA20803@redhat.com>
Date: Sun, 29 Jan 2012 17:07:11 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: mingo@...hat.com, hpa@...or.com, linux-kernel@...r.kernel.org,
a.p.zijlstra@...llo.nl, y-goto@...fujitsu.com,
akpm@...ux-foundation.org, tglx@...utronix.de, mingo@...e.hu,
linux-tip-commits@...r.kernel.org
Subject: Re: [tip:sched/core] sched: Fix ancient race in do_exit()

On 01/28, Linus Torvalds wrote:
>
> On Sat, Jan 28, 2012 at 4:03 AM, tip-bot for Yasunori Goto
> <y-goto@...fujitsu.com> wrote:
> >
> > sched: Fix ancient race in do_exit()
>
> Ugh.
>
> It would be much nicer to just clear the rwsem waiter->task thing
> *after* waking the task up, which would avoid this race entirely,
> afaik.

How? The problem is that wake_up_process(tsk) sees this task in
TASK_UNINTERRUPTIBLE state (the first "p->state & state" check in
try_to_wake_up), but then this task changes its state to TASK_DEAD
without schedule() and ttwu() does s/TASK_DEAD/TASK_RUNNING/.

IOW, the task doing

	current->state = TASK_A;
	...
	current->state = TASK_B;
	schedule();

can be woken up by try_to_wake_up(TASK_A), despite the fact it
sleeps in TASK_B. do_exit() is only "special" because it is not
easy to handle the spurious wakeup.
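
To make the interleaving concrete, here is a minimal user-space sketch
that replays it by hand. It is not kernel code: the toy_* names, the
state values, and the split of the wakeup into a "check" step and a
"commit" step are assumptions made for illustration, and no locking or
memory ordering is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy task states; the values are illustrative, not taken from the kernel. */
enum toy_state { TOY_RUNNING = 0, TOY_UNINTERRUPTIBLE = 2, TOY_DEAD = 64 };

struct toy_task {
	int state;
};

/* Step 1 of the toy wakeup: the "p->state & state" style check. */
static bool toy_ttwu_check(const struct toy_task *p, int mask)
{
	return (p->state & mask) != 0;
}

/* Step 2 of the toy wakeup: unconditionally mark the task runnable. */
static void toy_ttwu_commit(struct toy_task *p)
{
	p->state = TOY_RUNNING;
}

/*
 * Replay the bad interleaving and return the state the exiting task
 * holds just before it would call schedule().
 */
static int toy_replay_stale_wakeup(void)
{
	struct toy_task t = { .state = TOY_UNINTERRUPTIBLE };

	/* Waker: sees the task in the state it wants to wake. */
	bool matched = toy_ttwu_check(&t, TOY_UNINTERRUPTIBLE);
	assert(matched);

	/* Task: moves on without schedule() and marks itself dead. */
	t.state = TOY_DEAD;

	/* Waker: acts on the stale check, s/TOY_DEAD/TOY_RUNNING/. */
	if (matched)
		toy_ttwu_commit(&t);

	return t.state;
}
```

In this toy model toy_replay_stale_wakeup() returns TOY_RUNNING, so
the task would enter schedule() as runnable rather than dead.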

> Tell me, why wouldn't that work? rwsem_down_failed_common() does
>
> 	/* wait to be given the lock */
> 	for (;;) {
> 		if (!waiter.task)
> 			break;
> 		...
>
> so then we wouldn't need the task refcount crap in rwsem either etc,
> and we'd get rid of all races with wakeup.
>
> I wonder why we're clearing that whole waiter->task so early.

I must have missed something. I can't understand how this can help,
and "clear the rwsem waiter->task thing *after* waking" looks
obviously wrong. If we do this, then we can miss the "!waiter.task"
condition. The loop above actually does

	set_task_state(TASK_UNINTERRUPTIBLE);
	if (!waiter.task)
		break;
	schedule();

and

	wake_up_process(tsk);
	waiter->task = NULL;

can happen right after set_task_state().
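
A rough way to check this claim is to enumerate every interleaving of
the wait loop and the two waker steps in a toy model. The sketch below
does that in user space. All toy_* names are hypothetical, the sleeper
is reduced to three steps (set state, check, schedule), the waker to
two, and preemption, barriers and the rest of the rwsem code are
ignored, so it only illustrates the ordering argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the toy sleeper is in its wait loop. */
enum toy_pc { PC_SET_STATE, PC_CHECK, PC_SCHEDULE, PC_DONE };

struct toy_sim {
	enum toy_pc pc;    /* next sleeper step */
	bool sleeping;     /* true while state is "TASK_UNINTERRUPTIBLE" */
	bool waiter_task;  /* models waiter.task != NULL */
	int waker_done;    /* how many of the waker's two steps have run */
};

/* One sleeper step: set_task_state(); if (!waiter.task) break; schedule(); */
static struct toy_sim toy_sleeper_step(struct toy_sim s)
{
	switch (s.pc) {
	case PC_SET_STATE:
		s.sleeping = true;
		s.pc = PC_CHECK;
		break;
	case PC_CHECK:
		s.pc = s.waiter_task ? PC_SCHEDULE : PC_DONE;
		break;
	case PC_SCHEDULE:
		/* Only reached while runnable: schedule() returns, loop again. */
		s.pc = PC_SET_STATE;
		break;
	case PC_DONE:
		break;
	}
	return s;
}

/* The waker's next step, in the order selected by wake_first. */
static struct toy_sim toy_waker_step(struct toy_sim s, bool wake_first)
{
	bool is_wake = (s.waker_done == 0) == wake_first;

	assert(s.waker_done < 2);
	if (is_wake)
		s.sleeping = false;     /* wake_up_process(): make it runnable */
	else
		s.waiter_task = false;  /* waiter->task = NULL */
	s.waker_done++;
	return s;
}

/* Count interleavings that end with the sleeper blocked and no waker left. */
static int toy_count_hangs(struct toy_sim s, bool wake_first)
{
	bool blocked = s.pc == PC_SCHEDULE && s.sleeping;
	bool waker_left = s.waker_done < 2;
	int hangs = 0;

	if (s.pc == PC_DONE)
		return 0;
	if (blocked && !waker_left)
		return 1;
	if (!blocked)
		hangs += toy_count_hangs(toy_sleeper_step(s), wake_first);
	if (waker_left)
		hangs += toy_count_hangs(toy_waker_step(s, wake_first), wake_first);
	return hangs;
}

/* Start with the sleeper runnable, waiter.task set, and the waker idle. */
static int toy_hangs_for_order(bool wake_first)
{
	struct toy_sim start = { PC_SET_STATE, false, true, 0 };

	return toy_count_hangs(start, wake_first);
}
```

In this model toy_hangs_for_order(true), i.e. wake first and clear
afterwards, finds interleavings where the sleeper blocks in schedule()
with nobody left to wake it, for example when its waiter.task check
falls between the two waker steps. toy_hangs_for_order(false), the
clear-then-wake order, finds none.
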
Oleg.