Message-Id: <20190903160036.2400-4-mathieu.desnoyers@efficios.com>
Date:   Tue,  3 Sep 2019 12:00:36 -0400
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Oleg Nesterov <oleg@...hat.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Chris Metcalf <cmetcalf@...hip.com>,
        Christoph Lameter <cl@...ux.com>,
        Kirill Tkhai <tkhai@...dex.ru>, Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>
Cc:     linux-kernel@...r.kernel.org,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: [RFC PATCH 3/3] Fix: sched/membarrier: use probe_kernel_address to read mm->membarrier_state

membarrier_global_expedited() reads the mm->membarrier_state of each
runqueue's current task to test a flag. A concurrent task exit can free
the memory backing the struct mm before that read. Therefore, use
probe_kernel_address() to read that flag. If the read faults, treat the
flag as unset: the scheduler code then provides the barriers required by
membarrier, and sending an IPI to that CPU would be redundant.

There is ongoing discussion about removing the need for
probe_kernel_address() in this code by providing additional existence
guarantees for struct mm. Meanwhile, this patch is submitted as a fix
intended for backporting to prior stable kernel releases.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Russell King - ARM Linux admin <linux@...linux.org.uk>
Cc: Chris Metcalf <cmetcalf@...hip.com>
Cc: Christoph Lameter <cl@...ux.com>
Cc: Kirill Tkhai <tkhai@...dex.ru>
Cc: Mike Galbraith <efault@....de>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...nel.org>
---
 kernel/sched/membarrier.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 02feb7c8da4f..0312604d8315 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -58,6 +58,8 @@ static int membarrier_global_expedited(void)
 	cpus_read_lock();
 	for_each_online_cpu(cpu) {
 		struct task_struct *p;
+		struct mm_struct *mm;
+		int membarrier_state;
 
 		/*
 		 * Skipping the current CPU is OK even though we can be
@@ -72,17 +74,34 @@ static int membarrier_global_expedited(void)
 
 		rcu_read_lock();
 		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
-		if (p) {
-			struct mm_struct *mm = READ_ONCE(p->mm);
+		if (!p)
+			goto next;
+		mm = READ_ONCE(p->mm);
+		if (!mm)
+			goto next;
 
-			if (mm && (atomic_read(&mm->membarrier_state) &
-				   MEMBARRIER_STATE_GLOBAL_EXPEDITED)) {
-				if (!fallback)
-					__cpumask_set_cpu(cpu, tmpmask);
-				else
-					smp_call_function_single(cpu, ipi_mb, NULL, 1);
-			}
+		/*
+		 * Use probe_kernel_address() rather than atomic_read() to
+		 * load membarrier_state, because the mm may have been
+		 * concurrently freed. If probe_kernel_address() fails, the
+		 * struct mm has been freed, so there is no need to issue a
+		 * barrier on that particular CPU: the scheduler code is
+		 * taking care of it.
+		 *
+		 * It does not matter whether probe_kernel_address() reads
+		 * membarrier_state piece-wise, given that we only care
+		 * about testing a single bit.
+		 */
+		if (probe_kernel_address(&mm->membarrier_state.counter,
+					 membarrier_state))
+			membarrier_state = 0;
+		if (membarrier_state & MEMBARRIER_STATE_GLOBAL_EXPEDITED) {
+			if (!fallback)
+				__cpumask_set_cpu(cpu, tmpmask);
+			else
+				smp_call_function_single(cpu, ipi_mb, NULL, 1);
 		}
+	next:
 		rcu_read_unlock();
 	}
 	if (!fallback) {
-- 
2.17.1
