Message-ID: <4F1EE992.5000901@jp.fujitsu.com>
Date: Tue, 24 Jan 2012 12:25:38 -0500
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: peterz@...radead.org
CC: oleg@...hat.com, mingo@...e.hu, y-goto@...fujitsu.com,
tglx@...utronix.de, kamezawa.hiroyu@...fujitsu.com,
linux-kernel@...r.kernel.org
Subject: Re: [BUG] TASK_DEAD task is able to be woken up in special condition
On 1/24/2012 5:55 AM, Peter Zijlstra wrote:
> On Tue, 2012-01-24 at 11:19 +0100, Peter Zijlstra wrote:
>> On Wed, 2012-01-18 at 15:20 +0100, Oleg Nesterov wrote:
>>> do_exit() is different because it can not handle the spurious wakeup.
>>> Well, maybe we can? We can simply do
>>>
>>>	for (;;) {
>>>		tsk->state = TASK_DEAD;
>>>		schedule();
>>>	}
>>>
>>> __schedule() can't race with ttwu() once it takes rq->lock. If the
>>> exiting task is deactivated, finish_task_switch() will see EXIT_DEAD.
>>
>> TASK_DEAD, right?
>>
>>> Unless I missed something, the only problem is preempt_disable(),
>>> but schedule_debug() checks ->exit_state.
>>>
>>> OTOH, if we fix this race then probably schedule_debug() should
>>> check state == EXIT_DEAD instead.
>>
>> Hmm, interesting. On the up side that removes the need for that inf loop
>> after BUG, down side is of course that we lose the BUG itself too. Now
>> I'm not too sure we actually care about that, a task spinning at 100% in
>> X state should be fairly obvious borkage and it's not like we hit this
>> thing very often.
>
> Something like so, right? schedule_debug() already tests
> prev->exit_state so it should DTRT afaict.
>
> Also, while going over this again, I think Yasunori-San's patch isn't
> sufficient, note how the p->state = TASK_RUNNING in ttwu_do_wakeup() can
> happen outside of p->pi_lock when the task gets queued on a remote cpu.
>
> ---
> kernel/exit.c | 17 +++++++++++------
> 1 files changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 294b170..ccd4f84 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1039,13 +1039,18 @@ void do_exit(long code)
>  		__this_cpu_add(dirty_throttle_leaks, tsk->nr_dirtied);
>  	exit_rcu();
>  	/* causes final put_task_struct in finish_task_switch(). */
> -	tsk->state = TASK_DEAD;
>  	tsk->flags |= PF_NOFREEZE;	/* tell freezer to ignore us */
> -	schedule();
> -	BUG();
> -	/* Avoid "noreturn function does return". */
> -	for (;;)
> -		cpu_relax();	/* For when BUG is null */
> +	for (;;) {
> +		/*
> +		 * A spurious wakeup, eg. generated by rwsem when down()'s call
> +		 * to schedule() doesn't happen but the wakeup from the
> +		 * previous owner's up() did, can stomp on our ->state.
> +		 *
> +		 * This loop also avoids "noreturn functions does return"
> +		 */
> +		tsk->state = TASK_DEAD;
> +		schedule();
> +	}
>  }
This looks ok to me.
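
For anyone following along, here is a rough userspace toy model of why the loop tolerates a spurious wakeup. It is only a sketch and not kernel code: struct task, mock_schedule(), exit_loop() and the pending_spurious_wakeups counter are all made up for illustration, and the enum values are not the kernel's. The only thing it tries to show is that re-asserting TASK_DEAD before every schedule() call recovers from a late wakeup that overwrote ->state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's task states (illustrative values only). */
enum task_state { TASK_RUNNING, TASK_DEAD };

/* Hypothetical, heavily simplified task; not the kernel's task_struct. */
struct task {
	enum task_state state;
	int pending_spurious_wakeups;	/* late wakeups that will stomp on ->state */
};

/*
 * Toy model of schedule(): a late wakeup may overwrite ->state with
 * TASK_RUNNING before the scheduler looks at it.
 *
 * Returns true if the task was seen as TASK_DEAD (the real schedule()
 * would never return to the caller in that case), false if the call
 * "returned" because the state had been stomped.
 */
static bool mock_schedule(struct task *t)
{
	if (t->pending_spurious_wakeups > 0) {
		t->pending_spurious_wakeups--;
		t->state = TASK_RUNNING;	/* the stomp */
	}
	return t->state == TASK_DEAD;
}

/*
 * Models the loop from the patch: re-assert TASK_DEAD before every
 * schedule() call. Returns how many schedule() calls were needed.
 */
static int exit_loop(struct task *t)
{
	int calls = 0;

	for (;;) {
		t->state = TASK_DEAD;
		calls++;
		if (mock_schedule(t))
			return calls;
	}
}
```

In this model, a task with one pending spurious wakeup needs two schedule() calls before it is seen as TASK_DEAD, whereas the old single "set state; schedule(); BUG();" sequence would have fallen through to BUG() after the first call.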