linux-kernel - [RFC][PATCH] signal: Store pending signal exit in tsk.jobctl not in tsk.pending

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87o97q1cky.fsf_-_@xmission.com>
Date:   Tue, 05 Feb 2019 09:26:05 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Dmitry Vyukov <dvyukov@...gle.com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        jolsa@...hat.com, Namhyung Kim <namhyung@...nel.org>,
        luca abeni <luca.abeni@...tannapisa.it>,
        syzkaller <syzkaller@...glegroups.com>,
        Oleg Nesterov <oleg@...hat.com>
Subject: [RFC][PATCH] signal: Store pending signal exit in tsk.jobctl not in tsk.pending


Recently syzkaller was able to create unkillablle processes by
creating a timer that is delivered as a thread local signal on SIGHUP,
and receiving SIGHUP SA_NODEFERER.  Ultimately causing a loop
failing to deliver SIGHUP but always trying.

Upon examination it turns out part of the problem is actually most of
the solution.  Since 2.5 complete_signal has found all fatal signals
and queued SIGKILL in every threads thread queue relying on
signal->group_exit_code to preserve the information of which was the
actual fatal signal.

The conversion of all fatal signals to SIGKILL results in the
synchronous signal heuristic in next_signal kicking in and preferring
SIGHUP to SIGKILL.  Which is especially problematic as all
fatal signals have already been transformed into SIGKILL.

Now that we have task->jobctl we can do better and set a bit in
task->jobctl instead of reusing tsk->pending[SIGKILL].  This allows
giving already detected process exits a higher priority than any
pending signal.

Cc: stable@...r.kernel.org
Reported-by: Dmitry Vyukov <dvyukov@...gle.com>
Ref: ebf5ebe31d2c ("[PATCH] signal-fixes-2.5.59-A4")
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>
---
Oleg, can you give this a quick review and see if I am missing anything?
Dmitry, can you verify this runs cleanly in your test setups?

 fs/coredump.c                |  2 +-
 include/linux/sched/jobctl.h |  1 +
 include/linux/sched/signal.h |  2 +-
 kernel/signal.c              | 10 ++++++++--
 4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/coredump.c b/fs/coredump.c
index e42e17e55bfd..487995293ef0 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -322,7 +322,7 @@ static int zap_process(struct task_struct *start, int exit_code, int flags)
 	for_each_thread(start, t) {
 		task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
 		if (t != current && t->mm) {
-			sigaddset(&t->pending.signal, SIGKILL);
+			t->jobctl |= JOBCTL_TASK_EXIT;
 			signal_wake_up(t, 1);
 			nr++;
 		}
diff --git a/include/linux/sched/jobctl.h b/include/linux/sched/jobctl.h
index 98228bd48aee..ff7b3ea43f4c 100644
--- a/include/linux/sched/jobctl.h
+++ b/include/linux/sched/jobctl.h
@@ -18,6 +18,7 @@ struct task_struct;
 #define JOBCTL_TRAP_NOTIFY_BIT	20	/* trap for NOTIFY */
 #define JOBCTL_TRAPPING_BIT	21	/* switching to TRACED */
 #define JOBCTL_LISTENING_BIT	22	/* ptracer is listening for events */
+#define JOBCTL_TASK_EXIT	23	/* task is exiting */
 
 #define JOBCTL_STOP_DEQUEUED	(1UL << JOBCTL_STOP_DEQUEUED_BIT)
 #define JOBCTL_STOP_PENDING	(1UL << JOBCTL_STOP_PENDING_BIT)
diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 13789d10a50e..3f3edadf1ae1 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -354,7 +354,7 @@ static inline int signal_pending(struct task_struct *p)
 
 static inline int __fatal_signal_pending(struct task_struct *p)
 {
-	return unlikely(sigismember(&p->pending.signal, SIGKILL));
+	return unlikely(p->jobctl & JOBCTL_TASK_EXIT);
 }
 
 static inline int fatal_signal_pending(struct task_struct *p)
diff --git a/kernel/signal.c b/kernel/signal.c
index 9ca8e5278c8e..0577e37fdf43 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -989,7 +989,7 @@ static void complete_signal(int sig, struct task_struct *p, enum pid_type type)
 			t = p;
 			do {
 				task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
-				sigaddset(&t->pending.signal, SIGKILL);
+				t->jobctl |= JOBCTL_TASK_EXIT;
 				signal_wake_up(t, 1);
 			} while_each_thread(p, t);
 			return;
@@ -1273,7 +1273,7 @@ int zap_other_threads(struct task_struct *p)
 		/* Don't bother with already dead threads */
 		if (t->exit_state)
 			continue;
-		sigaddset(&t->pending.signal, SIGKILL);
+		t->jobctl |= JOBCTL_TASK_EXIT;
 		signal_wake_up(t, 1);
 	}
 
@@ -2393,6 +2393,11 @@ bool get_signal(struct ksignal *ksig)
 		goto relock;
 	}
 
+	/* Has this task already been flagged for death? */
+	ksig->info.si_signo = signr = SIGKILL;
+	if (current->jobctl & JOBCTL_TASK_EXIT)
+		goto fatal;
+
 	for (;;) {
 		struct k_sigaction *ka;
 
@@ -2488,6 +2493,7 @@ bool get_signal(struct ksignal *ksig)
 			continue;
 		}
 
+	fatal:
 		spin_unlock_irq(&sighand->siglock);
 
 		/*
-- 
2.17.1