linux-kernel - [PATCH v3 1/1] workqueue: ignore dead tasks in a workqueue sleep hook

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20161025110357.8821-1-roman.penyaev@profitbricks.com>
Date:   Tue, 25 Oct 2016 13:03:57 +0200
From:   Roman Pen <roman.penyaev@...fitbricks.com>
To:     unlisted-recipients:; (no To-header on input)
Cc:     Roman Pen <roman.penyaev@...fitbricks.com>,
        Andy Lutomirski <luto@...nel.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Tejun Heo <tj@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: [PATCH v3 1/1] workqueue: ignore dead tasks in a workqueue sleep hook

If panic_on_oops is not set and oops happens inside workqueue kthread,
kernel kills this kthread.  Current patch fixes recursive GPF which
happens when wq_worker_sleeping() function unconditionally accesses
the NULL kthread->vfork_done ptr thru kthread_data() -> to_kthread().

The stack is the following:

[<ffffffff81397f75>] dump_stack+0x68/0x93
[<ffffffff8106954b>] ? do_exit+0x7ab/0xc10
[<ffffffff8108fd73>] __schedule_bug+0x83/0xe0
[<ffffffff81716d5a>] __schedule+0x7ea/0xba0
[<ffffffff810c864f>] ? vprintk_default+0x1f/0x30
[<ffffffff8116a63c>] ? printk+0x48/0x50
[<ffffffff81717150>] schedule+0x40/0x90
[<ffffffff8106976a>] do_exit+0x9ca/0xc10
[<ffffffff810c8e3d>] ? kmsg_dump+0x11d/0x190
[<ffffffff810c8d37>] ? kmsg_dump+0x17/0x190
[<ffffffff81021ee9>] oops_end+0x99/0xd0
[<ffffffff81052da5>] no_context+0x185/0x3e0
[<ffffffff81053083>] __bad_area_nosemaphore+0x83/0x1c0
[<ffffffff810c820e>] ? vprintk_emit+0x25e/0x530
[<ffffffff810531d4>] bad_area_nosemaphore+0x14/0x20
[<ffffffff8105355c>] __do_page_fault+0xac/0x570
[<ffffffff810c66fe>] ? console_trylock+0x1e/0xe0
[<ffffffff81002036>] ? trace_hardirqs_off_thunk+0x1a/0x1c
[<ffffffff81053a2c>] do_page_fault+0xc/0x10
[<ffffffff8171f812>] page_fault+0x22/0x30
[<ffffffff81089bc3>] ? kthread_data+0x33/0x40
[<ffffffff8108427e>] ? wq_worker_sleeping+0xe/0x80
[<ffffffff817169eb>] __schedule+0x47b/0xba0
[<ffffffff81717150>] schedule+0x40/0x90
[<ffffffff8106957d>] do_exit+0x7dd/0xc10
[<ffffffff81021ee9>] oops_end+0x99/0xd0

kthread->vfork_done is zeroed out on the following path:

    do_exit()
    exit_mm()
    mm_release()
    complete_vfork_done()

In order to fix a bug dead tasks must be ignored.

Signed-off-by: Roman Pen <roman.penyaev@...fitbricks.com>
Cc: Andy Lutomirski <luto@...nel.org>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org
---
v3:
  o minor comment and coding style fixes.

v2:
  o put a task->state check directly into a wq_worker_sleeping() function
    instead of changing the __schedule().

 kernel/workqueue.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9dc7ac5101e0..1525456b18c6 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -875,9 +875,33 @@ void wq_worker_waking_up(struct task_struct *task, int cpu)
  */
 struct task_struct *wq_worker_sleeping(struct task_struct *task)
 {
-	struct worker *worker = kthread_data(task), *to_wakeup = NULL;
+	struct worker *worker, *to_wakeup = NULL;
 	struct worker_pool *pool;
 
+
+	if (task->state == TASK_DEAD) {
+		/*
+		 * Here we try to catch the following path before
+		 * accessing NULL kthread->vfork_done ptr thru
+		 * kthread_data():
+		 *
+		 *    oops_end()
+		 *    do_exit()
+		 *    schedule()
+		 *
+		 * If panic_on_oops is not set and oops happens on
+		 * a workqueue execution path, thread will be killed.
+		 * That is definitly sad, but not to make the situation
+		 * even worse we have to ignore dead tasks in order not
+		 * to step on zeroed out members (e.g. t->vfork_done is
+		 * already NULL on that path, since we were called by
+		 * do_exit())).
+		 */
+		return NULL;
+	}
+
+	worker = kthread_data(task);
+
 	/*
 	 * Rescuers, which may not have all the fields set up like normal
 	 * workers, also reach here, let's not access anything before
-- 
2.9.3