Message-ID: <20071128012129.GD6840@v2.random>
Date:	Wed, 28 Nov 2007 02:21:29 +0100
From:	Andrea Arcangeli <andrea@...e.de>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org, jack@...e.cz,
	Ingo Molnar <mingo@...e.hu>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Alexey Dobriyan <adobriyan@...il.com>
Subject: Re: /proc dcache deadlock in do_exit

On Tue, Nov 27, 2007 at 02:38:52PM -0800, Andrew Morton wrote:
> I don't see why the schedule() will not return?  Because the task has
> PF_EXITING set?  Doesn't TASK_DEAD do that?

Ouch, I assumed you couldn't sleep safely anymore in release_task,
given that it's the function that frees the task structure itself and
there was no preempt-related action anywhere close to it!
delayed_put_task_struct can be called once a quiescent point is
reached, and any scheduling would allow exactly that (it requires
quite a bit of a race: a local irq triggering a reschedule, the timer
irq invoking the tasklet that frees the task struct before do_exit
finishes, and all other cpus in a quiescent state too).

So a corollary question is how can it be safe to call
preempt_disable() after call_rcu(delayed_put_task_struct)?

Back in sles9 preempt_disable was implemented as
_raw_write_unlock(&tasklist_lock) and it happened _before_
release_task, and scheduling there wouldn't return because PF_DEAD was
already set. If schedule() can return in mainline, it will crash for a
different reason, because the task struct is long gone by the time
release_task+schedule() runs. Either way, there is still a
kernel-crashing bug. Or is there some magic that prevents call_rcu +
schedule from invoking the rcu callback?

So you may need to apply this one too (this one is needed to fix the
second bug; my previous patch is needed after applying this one):

Signed-off-by: Andrea Arcangeli <andrea@...e.de>

diff --git a/kernel/exit.c b/kernel/exit.c
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -841,6 +841,13 @@ static void exit_notify(struct task_stru
 
 	write_unlock_irq(&tasklist_lock);
 
+	/*
+	 * Task struct can go away at the first schedule if this was a
+	 * self reaping task. Scheduling is forbidden until we set
+	 * the state to TASK_DEAD.
+	 */
+	preempt_disable();
+
 	/* If the process is dead, release it - nobody will wait for it */
 	if (state == EXIT_DEAD)
 		release_task(tsk);
@@ -1042,7 +1049,6 @@ fastcall NORET_TYPE void do_exit(long co
 	if (tsk->splice_pipe)
 		__free_pipe_info(tsk->splice_pipe);
 
-	preempt_disable();
 	/* causes final put_task_struct in finish_task_switch(). */
 	tsk->state = TASK_DEAD;
 

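The ordering concern behind the patch can be illustrated with a toy
userspace model. This is only a sketch of the scenario described above,
not kernel code and not a confirmation that mainline actually behaves
this way: struct toy_task, toy_release_task(), toy_preempt_point() and
toy_do_exit() are all hypothetical names, and the "free" is just a flag
so the model itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these are real kernel structures or functions. */
struct toy_task {
	bool rcu_pending;   /* a deferred free is queued (stands in for call_rcu) */
	bool freed;         /* the deferred callback has "freed" the task */
	int preempt_count;  /* > 0 means preemption is disabled */
	bool dead;          /* stands in for tsk->state = TASK_DEAD */
};

/* Stands in for release_task() on a self-reaping task: queue the free. */
static void toy_release_task(struct toy_task *t)
{
	t->rcu_pending = true;
}

/*
 * Stands in for an irq requesting a reschedule. If preemption is enabled,
 * the task is switched out and, in the assumed worst case, the grace
 * period ends and the queued callback runs before the task resumes.
 */
static void toy_preempt_point(struct toy_task *t)
{
	if (t->preempt_count > 0)
		return;             /* preemption disabled: nothing can run */
	if (t->rcu_pending) {
		t->rcu_pending = false;
		t->freed = true;
	}
}

/* Returns true if the exit path would touch the task after it was freed. */
bool toy_do_exit(bool disable_before_release)
{
	struct toy_task t = { 0 };
	bool use_after_free;

	if (disable_before_release)
		t.preempt_count++;  /* ordering proposed by the patch */

	toy_release_task(&t);
	toy_preempt_point(&t);      /* worst case: reschedule right here */

	if (!disable_before_release)
		t.preempt_count++;  /* ordering before the patch */

	/* Preemption is off by the time the state is set, in both orderings. */
	assert(t.preempt_count == 1);

	use_after_free = t.freed;   /* setting the state would hit freed memory */
	t.dead = true;
	return use_after_free;
}
```

In this model toy_do_exit(false) reports a use-after-free and
toy_do_exit(true) does not, which is the whole point of moving
preempt_disable() ahead of release_task().
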
> What are the implications of not running shrink_dcache_parent() on
> the exit path sometimes?  We'll leave procfs stuff behind?  Will
> they be reaped by memory pressure later on?

Yes.