linux-kernel - SIGTRAP vs. sys_exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <48EA1BE9.1030707@siemens.com>
Date:	Mon, 06 Oct 2008 16:08:41 +0200
From:	Jan Kiszka <jan.kiszka@...mens.com>
To:	Roland McGrath <roland@...hat.com>, Oleg Nesterov <oleg@...sign.ru>
CC:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: SIGTRAP vs. sys_exit_group race

Hi,

are there any news on these ideas?

http://marc.info/?l=linux-kernel&m=121540671602971

I've been caught by a race between a ptraced thread running on a
breakpoint, thus generating a SIGTRAP and another thread in this process
issuing sys_exit_group. The discussion above, specifically Oleg's
concerns, made me think that this is a generic issue of current
mainline.

I observed this on a heavily patched 2.6.26.5 kernel which comes, among
other things, with a higher probability for latencies/reschedules
between the queuing of SIGTRAP and the actual delivery. Right into this
window, the sys_exit_group comes. It informs gdb about the termination,
sends out SIGKILL to the other threads and turns the caller into a
zombie. Now the second thread has SIGKILL + SIGTRAP pending, and it
picks SIGTRAP for delivery. At this point gdb gets confused (maybe a bug
of its own?), sends SIGSTOP to the dead thread and waits for it to enter
the traced state (which it will never do) - deadlock of gdb, only
resolvable by killing the latter.

The patch below (rebased against latest git) resolves the issue for me,
but I'm definitely not sure about all its implications and if I'm not
papering over a different issue. Could you comment on my scenario? Is it
possible with mainline as well? Will Roland's approach resolve it?

Thanks,
Jan

---
Index: b/kernel/signal.c
===================================================================
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1528,10 +1528,11 @@ static void ptrace_stop(int exit_code, i
 		spin_unlock_irq(&current->sighand->siglock);
 		arch_ptrace_stop(exit_code, info);
 		spin_lock_irq(&current->sighand->siglock);
-		if (sigkill_pending(current))
-			return;
 	}

+	if (sigkill_pending(current))
+		return;
+
 	/*
 	 * If there is a group stop in progress,
 	 * we must participate in the bookkeeping.
-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/