linux-kernel - Re: [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20140901175851.GA15210@redhat.com>
Date:	Mon, 1 Sep 2014 19:58:51 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Kautuk Consul <consul.kautuk@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Michal Hocko <mhocko@...e.cz>,
	David Rientjes <rientjes@...gle.com>,
	Ionut Alexa <ionut.m.alexa@...il.com>,
	Guillaume Morin <guillaume@...infr.org>,
	linux-kernel@...r.kernel.org, Kirill Tkhai <tkhai@...dex.ru>
Subject: Re: [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race
	with try_to_wake_up()

On 09/01, Peter Zijlstra wrote:
>
> On Mon, Aug 25, 2014 at 05:57:38PM +0200, Oleg Nesterov wrote:
> > Peter, do you remember another problem with TASK_DEAD we discussed recently?
> > (prev_state == TASK_DEAD detection in finish_task_switch() still looks racy).
>
> Uhm, right. That was somewhere on the todo list :-)
>
> > I am starting to think that perhaps we need something like below, what do
> > you all think?
>
> I'm thinking you lost the hunk that adds rq::dead :-), more comments
> below.

And "goto deactivate" should be moved down, after "switch_count"
initialization.

> > +	if (unlikely(rq->dead))
> > +		goto deactivate;
> > +
>
> Yeah, it would be best to not have to do that; ideally we would be able
> to maybe do both; set rq->dead and current->state == TASK_DEAD.

To avoid spin_unlock_wait() in do_exit(). But on a second thought this
can't work, please see below.

> > --- x/kernel/exit.c
> > +++ x/kernel/exit.c
> > @@ -815,25 +815,8 @@ void do_exit(long code)
> >  		__this_cpu_add(dirty_throttle_leaks, tsk->nr_dirtied);
> >  	exit_rcu();
> >
> > -	/*
> > -	 * The setting of TASK_RUNNING by try_to_wake_up() may be delayed
> > -	 * when the following two conditions become true.
> > -	 *   - There is race condition of mmap_sem (It is acquired by
> > -	 *     exit_mm()), and
> > -	 *   - SMI occurs before setting TASK_RUNINNG.
> > -	 *     (or hypervisor of virtual machine switches to other guest)
> > -	 *  As a result, we may become TASK_RUNNING after becoming TASK_DEAD
> > -	 *
> > -	 * To avoid it, we have to wait for releasing tsk->pi_lock which
> > -	 * is held by try_to_wake_up()
> > -	 */
> > -	smp_mb();
> > -	raw_spin_unlock_wait(&tsk->pi_lock);
> > -
> > -	/* causes final put_task_struct in finish_task_switch(). */
> > -	tsk->state = TASK_DEAD;
> >  	tsk->flags |= PF_NOFREEZE;	/* tell freezer to ignore us */
> > -	schedule();
> > +	exit_schedule();
> >  	BUG();
> >  	/* Avoid "noreturn function does return".  */
> >  	for (;;)
>
> Yes, something like this might work fine..

Not really :/ Yes, rq->dead (or just "bool prev_dead") should obviously
solve the problem with ttwu() after the last schedule(). But only in a
sense that the dying task won't be activated.

However, the very fact that another CPU can look at this task_struct
means that we still need spin_unlock_wait(). If nothing else to ensure
that try_to_wake_up()->spin_unlock(pi_lock) won't write into the memory
we are are going to free.

So I think the comment in do exit should be updated too, and smp_mb()
should be moved under raw_spin_unlock_wait() but ...

But. If am right, doesn't this mean we that have even more problems with
postmortem wakeups??? Why ttwu() can't _start_ after spin_unlock_wait ?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/