linux-kernel - [PATCH 1/4 v3] call_function_many: fix list delete vs add race


Message-ID: <smp-call-function-list-race-fix-v3@mdm.bga.com>
Date:	Tue, 15 Mar 2011 13:27:16 -0600
From:	Milton Miller <miltonm@....com>
To:	Peter Zijlstra <peterz@...radead.org>, akpm@...ux-foundation.org
Cc:	Anton Blanchard <anton@...ba.org>, xiaoguangrong@...fujitsu.com,
	mingo@...e.hu, jaxboe@...ionio.com, npiggin@...il.com,
	rusty@...tcorp.com.au, efault@....de,
	Jan Beulich <JBeulich@...ell.com>,
	Dimitri Sivanich <sivanich@....com>,
	Tony Luck <tony.luck@...el.com>, torvalds@...ux-foundation.org,
	paulmck@...ux.vnet.ibm.com, benh@...nel.crashing.org,
	linux-kernel@...r.kernel.org
Subject: [PATCH 1/4 v3] call_function_many: fix list delete vs add race

Peter pointed out that nothing prevents the list_del_rcu in
smp_call_function_interrupt from running before the list_add_rcu in
smp_call_function_many.  Fix this by not setting refs until we hold
the list lock and have added the element.  Take advantage of the wmb
in list_add_rcu to avoid an explicit additional one.

I tried to force this race with a udelay before the lock & list_add
and by mixing all 64 online cpus with just 3 random cpus in the mask,
but was unsuccessful.  Still, inspection shows a valid race, and the
fix is an extension of the existing protection window in the current code.

Cc: stable (v2.6.32 and later)
Reported-by: Peter Zijlstra <peterz@...radead.org>
Signed-off-by: Milton Miller <miltonm@....com>
---
v2: rely on the wmb in list_add_rcu rather than on the combined partial
ordering of spin lock and unlock, which does not provide the needed
guarantees.
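
For readers who want the ordering spelled out, here is a rough user-space
model of the publish sequence after this patch.  It is only a sketch, not
the kernel code: struct call_data, publish(), handle_ipi() and the
spinning atomic_flag lock are hypothetical names, the cpumask is a single
unsigned long, and C11 release/acquire operations stand in for the wmb in
list_add_rcu and the smp_rmb in the interrupt handler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct call_function_data. */
struct call_data {
	void (*func)(void *info);
	void *info;
	atomic_ulong cpumask;			/* one bit per modelled cpu */
	atomic_int refs;			/* 0 means "not ready yet" */
	_Atomic(struct call_data *) next;
};

static _Atomic(struct call_data *) queue;	/* models call_function.queue */
static atomic_flag queue_lock = ATOMIC_FLAG_INIT;

static void lock_queue(void)
{
	while (atomic_flag_test_and_set_explicit(&queue_lock, memory_order_acquire))
		;				/* spin */
}

static void unlock_queue(void)
{
	atomic_flag_clear_explicit(&queue_lock, memory_order_release);
}

static int weight(unsigned long mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Sender side: fill in the block, take the lock, add, and only then set refs. */
static void publish(struct call_data *d, void (*func)(void *), void *info,
		    unsigned long mask)
{
	d->func = func;
	d->info = info;
	atomic_store_explicit(&d->cpumask, mask, memory_order_relaxed);

	lock_queue();
	atomic_store_explicit(&d->next,
			      atomic_load_explicit(&queue, memory_order_relaxed),
			      memory_order_relaxed);
	/* Release store models the wmb in list_add_rcu. */
	atomic_store_explicit(&queue, d, memory_order_release);
	/* refs is written last, so no handler can drop it to 0 before the add. */
	atomic_store_explicit(&d->refs, weight(mask), memory_order_release);
	unlock_queue();
}

/* Caller holds queue_lock; d must be on the queue. */
static void unlink_locked(struct call_data *d)
{
	_Atomic(struct call_data *) *pp = &queue;
	struct call_data *cur;

	while ((cur = atomic_load_explicit(pp, memory_order_relaxed)) && cur != d)
		pp = &cur->next;
	assert(cur == d);
	/* d->next is left intact so a concurrent walker can continue past d. */
	atomic_store_explicit(pp, atomic_load_explicit(&d->next, memory_order_relaxed),
			      memory_order_release);
}

/* Receiver side: returns the number of callbacks run for this cpu. */
static int handle_ipi(int cpu)
{
	unsigned long bit = 1UL << cpu;
	struct call_data *d;
	int ran = 0;

	for (d = atomic_load_explicit(&queue, memory_order_acquire); d;
	     d = atomic_load_explicit(&d->next, memory_order_acquire)) {
		if (!(atomic_load_explicit(&d->cpumask, memory_order_relaxed) & bit))
			continue;
		/* Acquire pairs with the release store of refs in publish(). */
		if (atomic_load_explicit(&d->refs, memory_order_acquire) == 0)
			continue;		/* block is still being written */
		d->func(d->info);
		ran++;
		atomic_fetch_and_explicit(&d->cpumask, ~bit, memory_order_relaxed);
		if (atomic_fetch_sub_explicit(&d->refs, 1, memory_order_acq_rel) != 1)
			continue;
		lock_queue();			/* last reference: delete */
		unlink_locked(d);
		unlock_queue();
	}
	return ran;
}

/* Example callback: counts how many times it was invoked. */
static void count_call(void *info)
{
	++*(int *)info;
}
```

The point of the model is only the order inside publish(): because refs
stays 0 until the element is on the queue and the lock is held, a handler
that finds a reused block early skips it, and the delete in the last
handler cannot run ahead of the add.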

Index: linux-2.6/kernel/smp.c
===================================================================
--- linux-2.6.orig/kernel/smp.c	2011-01-31 17:44:47.182756513 -0600
+++ linux-2.6/kernel/smp.c	2011-01-31 18:25:47.266755387 -0600
@@ -491,14 +491,15 @@ void smp_call_function_many(const struct
 	cpumask_clear_cpu(this_cpu, data->cpumask);
 
 	/*
-	 * To ensure the interrupt handler gets an complete view
-	 * we order the cpumask and refs writes and order the read
-	 * of them in the interrupt handler.  In addition we may
-	 * only clear our own cpu bit from the mask.
+	 * We reuse the call function data without waiting for any grace
+	 * period after some other cpu removes it from the global queue.
+	 * This means a cpu might find our data block as it is written.
+	 * The interrupt handler waits until it sees refs filled out
+	 * while its cpu mask bit is set; here we may only clear our
+	 * own cpu mask bit, and must wait to set refs until we are sure
+	 * previous writes are complete and we have obtained the lock to
+	 * add the element to the queue.
 	 */
-	smp_wmb();
-
-	atomic_set(&data->refs, cpumask_weight(data->cpumask));
 
 	raw_spin_lock_irqsave(&call_function.lock, flags);
 	/*
@@ -507,6 +508,11 @@ void smp_call_function_many(const struct
 	 * will not miss any other list entries:
 	 */
 	list_add_rcu(&data->csd.list, &call_function.queue);
+	/*
+	 * We rely on the wmb() in list_add_rcu to order the writes
+	 * to func, data, and cpumask before this write to refs.
+	 */
+	atomic_set(&data->refs, cpumask_weight(data->cpumask));
 	raw_spin_unlock_irqrestore(&call_function.lock, flags);
 
 	/*