linux-kernel - Re: main thread pthread_exit/sys

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20090209033335.5DE77FC330@magilla.sf.frob.com>
Date:	Sun,  8 Feb 2009 19:33:35 -0800 (PST)
From:	Roland McGrath <roland@...hat.com>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Kaz Kylheku <kkylheku@...il.com>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ulrich Drepper <drepper@...hat.com>
Subject: Re: main thread pthread_exit/sys_exit bug!

> I don't know about other problems with the zombie leaders.

Ok, then we can just concentrate on the test case I posted.

> Except, I am worried whether the fix I have in mind is correct ;)
> It is simple, wait_task_stopped() should do

I think the first problem is we'll never even get into wait_task_stopped().
We'll be in wait_consider_task() on the group leader, which is EXIT_ZOMBIE.
First we need to adjust this:

	if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p))
		return wait_task_zombie(p, options, infop, stat_addr, ru);

to maybe:

	if (p->exit_state == EXIT_ZOMBIE) {
		if (delay_group_leader(p))
			return wait_task_zombie_leader(p, options,
						       infop, stat_addr, ru);
		return wait_task_zombie(p, options, infop, stat_addr, ru);
	}

In wait_task_zombie_leader(), it will have to take the siglock and try to
figure out if there is a completed group stop to report.

> 	if we tracer:
> 
> 		check ->state, eat ->exit_code

Being the ptracer does not affect the delay_group_leader logic.
It just affects individual vs group stop reports.  So the existing
code path is right for ptrace.

> 	else:
> 		check SIGNAL_STOP_STOPPED, use ->group_exit_code

We don't want wait to change group_exit_code.  But we need the "reported as
stopped" tracking that wait_task_stopped() gets by clearing exit_code.  So
I think what we need is to get the zombie group_leader->exit_code to be set
to ->group_exit_code as it would have been if the leader were alive and had
participated in the group stop.

> This looks logical, and should fix the problem. But this is
> the user-visible change. For example,
> 
> 	$ sleep 100
> 	^Z
> 	[1]+  Stopped                 sleep 100
> 	$ strace -p `pidof sleep`
> 	Process 11442 attached - interrupt to quit
> 
> strace hangs in do_wait(). But after the fix strace will happily
> proceed. I can't know whether this behaviour change is bad or not.

I think this would only happen if the "reported as stopped" bookkeeping I
mentioned above were broken.  The "Stopped" line means that the shell just
did do_wait(WUNTRACED), so wait_task_stopped() cleared ->exit_code when
reporting it as stopped.  Now strace does PTRACE_ATTACH and then a wait;
it can't see a fresh wait result here because ->exit_code is still zero.

100% untested concept patch follows.


Thanks,
Roland

==========
diff --git a/kernel/exit.c b/kernel/exit.c
index f80dec3..0000000 100644  
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1437,7 +1437,8 @@ static int wait_task_stopped(int ptrace,
 	exit_code = 0;
 	spin_lock_irq(&p->sighand->siglock);
 
-	if (unlikely(!task_is_stopped_or_traced(p)))
+	if (unlikely(!task_is_stopped_or_traced(p)) &&
+	    (ptrace || p->exit_state != EXIT_ZOMBIE || !delay_group_leader(p)))
 		goto unlock_sig;
 
 	if (!ptrace && p->signal->group_stop_count > 0)
@@ -1598,9 +1599,20 @@ static int wait_consider_task(struct tas
 
 	/*
 	 * We don't reap group leaders with subthreads.
+	 * When the group leader is dead, it still serves as
+	 * a moniker for the whole group for stop and continue.
+	 * But for ptrace, stop and continue are reported per-thread.
 	 */
-	if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p))
-		return wait_task_zombie(p, options, infop, stat_addr, ru);
+	if (p->exit_state == EXIT_ZOMBIE) {
+		if (!delay_group_leader(p))
+			return wait_task_zombie(p, options,
+						infop, stat_addr, ru);
+		*notask_error = 0;
+		if (unlikely(ptrace))
+			return 0;
+		return wait_task_stopped(p, options, infop, stat_addr, ru) ?:
+			wait_task_continued(p, options, infop, stat_addr, ru);
+	}
 
 	/*
 	 * It's stopped or running now, so it might
diff --git a/kernel/signal.c b/kernel/signal.c
index b6b3676..0000000 100644  
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1653,6 +1653,27 @@ finish_stop(int stop_count)
 }
 
 /*
+ * Complete group stop bookkeeping after decrementing sig->group_stop_count,
+ * to new value stop_count.  When it reaches zero, mark the process stopped.
+ *
+ * If the group leader is already dead, then it did not participate
+ * normally in the group stop.  But its ->exit_code stands for the whole
+ * group in do_wait() bookkeeping, so we need it to reflect the stop.
+ */
+static inline void complete_group_stop(struct task_struct *tsk,
+				       struct signal_struct *sig,
+				       int stop_count)
+{
+	if (stop_count)
+		return;
+
+	sig->flags = SIGNAL_STOP_STOPPED;
+
+	if (tsk->group_leader->exit_state)
+		tsk->group_leader->exit_code = sig->group_exit_code;
+}
+
+/*
  * This performs the stopping for SIGSTOP and other stop signals.
  * We have to stop all threads in the thread group.
  * Returns nonzero if we've actually stopped and released the siglock.
@@ -1696,8 +1717,7 @@ static int do_signal_stop(int signr)
 		sig->group_stop_count = stop_count;
 	}
 
-	if (stop_count == 0)
-		sig->flags = SIGNAL_STOP_STOPPED;
+	complete_group_stop(current, sig, stop_count);
 	current->exit_code = sig->group_exit_code;
 	__set_current_state(TASK_STOPPED);
 
@@ -1933,9 +1953,8 @@ void exit_signals(struct task_struct *ts
 		if (!signal_pending(t) && !(t->flags & PF_EXITING))
 			recalc_sigpending_and_wake(t);
 
-	if (unlikely(tsk->signal->group_stop_count) &&
-			!--tsk->signal->group_stop_count) {
-		tsk->signal->flags = SIGNAL_STOP_STOPPED;
+	if (unlikely(tsk->signal->group_stop_count)) {
+		complete_group_stop(tsk, sig, --tsk->signal->group_stop_count);
 		group_stop = 1;
 	}
 out:
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/