linux-kernel - [PATCH UPDATED 3/8] job control: Fix ptracer wait(2) hang and explain notask

Message-ID: <20110322105128.GP12003@htj.dyndns.org>
Date:	Tue, 22 Mar 2011 11:51:28 +0100
From:	Tejun Heo <tj@...nel.org>
To:	oleg@...hat.com, roland@...hat.com, jan.kratochvil@...hat.com,
	vda.linux@...glemail.com
Cc:	linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
	akpm@...ux-foundation.org, indan@....nu
Subject: [PATCH UPDATED 3/8] job control: Fix ptracer wait(2) hang and
 explain notask_error clearing

wait(2) and friends allow access to stopped/continued states through
zombies, which is required as the states are process-wide and should
be accessible whether the leader task is alive or undead.
wait_consider_task() implements this by always clearing notask_error
and going through wait_task_stopped/continued() for unreaped zombies.

However, while ptraced, the stopped state is per-task.  Once the
ptracee becomes a zombie, there is no further stopped event to wait
for, so wait(2) and friends should return -ECHILD for the tracee.

Fix it by clearing notask_error only if WCONTINUED | WEXITED is set
for ptraced zombies.  While at it, document why clearing notask_error
is safe for each case.
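
As a rough illustration of the rule above, here is a userspace model
of the decision for a zombie group leader whose reaping is delayed by
live subthreads.  This is a sketch, not kernel code:
zombie_clears_notask_error() is a hypothetical name, and the bool
parameter stands in for the kernel's ptrace argument.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <sys/wait.h>	/* WEXITED, WSTOPPED, WCONTINUED */

/*
 * Hypothetical userspace model of the patched zombie branch.
 * Returns true if notask_error would be cleared, i.e. the waiter
 * keeps waiting instead of failing with -ECHILD.
 */
static bool zombie_clears_notask_error(bool ptrace, int wo_flags)
{
	/* Not ptraced: process-wide state stays reachable via the zombie. */
	if (!ptrace)
		return true;

	/* Ptraced: stopped is per-task, only continued/exited remain. */
	return (wo_flags & (WCONTINUED | WEXITED)) != 0;
}
```

In this model a ptracer waiting with only WSTOPPED gets false, which
corresponds to the -ECHILD return described above.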

Test case follows.

  #include <stdio.h>
  #include <signal.h>
  #include <unistd.h>
  #include <pthread.h>
  #include <time.h>
  #include <sys/types.h>
  #include <sys/ptrace.h>
  #include <sys/wait.h>

  static void *nooper(void *arg)
  {
	  pause();
	  return NULL;
  }

  int main(void)
  {
	  const struct timespec ts1s = { .tv_sec = 1 };
	  pid_t tracee, tracer;
	  siginfo_t si;

	  tracee = fork();
	  if (tracee == 0) {
		  pthread_t thr;

		  pthread_create(&thr, NULL, nooper, NULL);
		  nanosleep(&ts1s, NULL);
		  printf("tracee exiting\n");
		  pthread_exit(NULL);	/* let subthread run */
	  }

	  tracer = fork();
	  if (tracer == 0) {
		  ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
		  while (1) {
			  if (waitid(P_PID, tracee, &si, WSTOPPED) < 0) {
				  perror("waitid");
				  break;
			  }
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(long)si.si_status);
		  }
		  return 0;
	  }

	  waitid(P_PID, tracer, &si, WEXITED);
	  kill(tracee, SIGKILL);
	  return 0;
  }

Before the patch, after the tracee becomes a zombie, the tracer's
waitid(WSTOPPED) never returns and the program doesn't terminate.

  tracee exiting
  ^C

After the patch, the tracee's exit makes waitid() fail with -ECHILD.

  tracee exiting
  waitid: No child processes
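
A tracer that relies on this behavior may want to tell ECHILD apart
from other waitid() failures.  A minimal sketch of one way to do that
follows; wait_for_stop() and enum stop_wait_result are hypothetical
names used only for illustration and are not part of this patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical result codes for the helper below. */
enum stop_wait_result { TRACEE_STOPPED, TRACEE_GONE, WAIT_FAILED };

/*
 * Wait for a stop event on @pid.  ECHILD is reported as TRACEE_GONE
 * so the caller can end its trace loop instead of treating it as a
 * generic error.
 */
static enum stop_wait_result wait_for_stop(pid_t pid, siginfo_t *si)
{
	for (;;) {
		if (waitid(P_PID, pid, si, WSTOPPED) == 0)
			return TRACEE_STOPPED;
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		return errno == ECHILD ? TRACEE_GONE : WAIT_FAILED;
	}
}
```

The tracer loop in the test case could call this in place of the bare
waitid() and break out on TRACEE_GONE.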

-v2: Oleg pointed out that, in addition to continued, exited can also
     happen for a ptraced dead group leader.  Clear notask_error for
     a ptraced child on WEXITED too.

Signed-off-by: Tejun Heo <tj@...nel.org>
Cc: Oleg Nesterov <oleg@...hat.com>
---
WEXITED bug fixed.  Let's tackle the ptraced per-task wait(2) thing
later.  Thanks.

 kernel/exit.c |   44 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 34 insertions(+), 10 deletions(-)

Index: work/kernel/exit.c
===================================================================
--- work.orig/kernel/exit.c
+++ work/kernel/exit.c
@@ -1550,17 +1550,41 @@ static int wait_consider_task(struct wai
 		return 0;
 	}
 
-	/*
-	 * We don't reap group leaders with subthreads.
-	 */
-	if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p))
-		return wait_task_zombie(wo, p);
+	/* slay zombie? */
+	if (p->exit_state == EXIT_ZOMBIE) {
+		/* we don't reap group leaders with subthreads */
+		if (!delay_group_leader(p))
+			return wait_task_zombie(wo, p);
 
-	/*
-	 * It's stopped or running now, so it might
-	 * later continue, exit, or stop again.
-	 */
-	wo->notask_error = 0;
+		/*
+		 * Allow access to stopped/continued state via zombie by
+		 * falling through.  Clearing of notask_error is complex.
+		 *
+		 * When !@ptrace:
+		 *
+		 * If WEXITED is set, notask_error should naturally be
+		 * cleared.  If not, subset of WSTOPPED|WCONTINUED is set,
+		 * so, if there are live subthreads, there are events to
+		 * wait for.  If all subthreads are dead, it's still safe
+		 * to clear - this function will be called again in finite
+		 * amount of time once all the subthreads are released and
+		 * will then return without clearing.
+		 *
+		 * When @ptrace:
+		 *
+		 * Stopped state is per-task and thus can't change once the
+		 * target task dies.  Only continued and exited can happen.
+		 * Clear notask_error if WCONTINUED | WEXITED.
+		 */
+		if (likely(!ptrace) || (wo->wo_flags & (WCONTINUED | WEXITED)))
+			wo->notask_error = 0;
+	} else {
+		/*
+		 * @p is alive and it's gonna stop, continue or exit, so
+		 * there always is something to wait for.
+		 */
+		wo->notask_error = 0;
+	}
 
 	if (task_stopped_code(p, ptrace))
 		return wait_task_stopped(wo, ptrace, p);