Message-ID: <1271968398.1646.16.camel@laptop>
Date:	Thu, 22 Apr 2010 22:33:18 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
	dipankar@...ibm.com, akpm@...ux-foundation.org,
	mathieu.desnoyers@...ymtl.ca, josh@...htriplett.org,
	dvhltc@...ibm.com, niv@...ibm.com, tglx@...utronix.de,
	rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
	eric.dumazet@...il.com
Subject: Re: [PATCH tip/core/urgent 3/3] sched: protect
 __sched_setscheduler() access to cgroups

On Thu, 2010-04-22 at 12:54 -0700, Paul E. McKenney wrote:
> A given task's cgroups structures must remain while that task is running
> due to reference counting, so this is presumably a false positive.
> Updated to reflect feedback from Tetsuo Handa.

I think it's not a false positive; I think we can race with the task
being placed in another cgroup. We hold neither task_lock() [our other
discussion] nor rq->lock [used by the sched ->attach() method].

That said, we should probably cure the race condition of
sched_setscheduler() vs ->attach().

Something like the below perhaps?

Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
---
 kernel/sched.c |   38 ++++++++++++++++++++++++++------------
 1 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 95eaecc..345df67 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4425,16 +4425,6 @@ recheck:
 	}
 
 	if (user) {
-#ifdef CONFIG_RT_GROUP_SCHED
-		/*
-		 * Do not allow realtime tasks into groups that have no runtime
-		 * assigned.
-		 */
-		if (rt_bandwidth_enabled() && rt_policy(policy) &&
-				task_group(p)->rt_bandwidth.rt_runtime == 0)
-			return -EPERM;
-#endif
-
 		retval = security_task_setscheduler(p, policy, param);
 		if (retval)
 			return retval;
@@ -4450,6 +4440,28 @@ recheck:
 	 * runqueue lock must be held.
 	 */
 	rq = __task_rq_lock(p);
+	retval = 0;
+#ifdef CONFIG_RT_GROUP_SCHED
+	if (user) {
+		/*
+		 * Do not allow realtime tasks into groups that have no runtime
+		 * assigned.
+		 *
+		 * RCU read lock not strictly required but here for PROVE_RCU,
+		 * the task is pinned by holding rq->lock which avoids races
+		 * with ->attach().
+		 */
+		rcu_read_lock();
+		if (rt_bandwidth_enabled() && rt_policy(policy) &&
+				task_group(p)->rt_bandwidth.rt_runtime == 0)
+			retval = -EPERM;
+		rcu_read_unlock();
+
+		if (retval)
+			goto unlock;
+	}
+#endif
+
 	/* recheck policy now with rq lock held */
 	if (unlikely(oldpolicy != -1 && oldpolicy != p->policy)) {
 		policy = oldpolicy = -1;
@@ -4477,12 +4489,14 @@ recheck:
 
 		check_class_changed(rq, p, prev_class, oldprio, running);
 	}
+unlock:
 	__task_rq_unlock(rq);
 	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
 
-	rt_mutex_adjust_pi(p);
+	if (!retval)
+		rt_mutex_adjust_pi(p);
 
-	return 0;
+	return retval;
 }
 
 /**
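
For illustration only, the locking pattern in the patch above can be
sketched in userspace. This is a minimal analogue, not kernel code: a
pthread mutex stands in for rq->lock, and every name here (struct
group, struct task, task_init(), task_attach(), task_set_policy(),
POLICY_NORMAL, POLICY_RT) is hypothetical. The point it shows is that
the "group has no runtime" check and the policy change happen under
the same lock that the attach path takes, so the group cannot change
between the check and the update.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

enum { POLICY_NORMAL = 0, POLICY_RT = 1 };	/* hypothetical policies */

struct group {
	long rt_runtime;		/* 0 means no realtime runtime assigned */
};

struct task {
	pthread_mutex_t lock;		/* stands in for rq->lock */
	struct group *grp;		/* stands in for task_group(p) */
	int policy;
};

void task_init(struct task *t, struct group *g)
{
	pthread_mutex_init(&t->lock, NULL);
	t->grp = g;
	t->policy = POLICY_NORMAL;
}

/* Analogue of the ->attach() path: the group changes only under the lock. */
void task_attach(struct task *t, struct group *g)
{
	pthread_mutex_lock(&t->lock);
	t->grp = g;
	pthread_mutex_unlock(&t->lock);
}

/* Analogue of the patched path: check and update under one lock hold. */
int task_set_policy(struct task *t, int policy)
{
	int retval = 0;

	pthread_mutex_lock(&t->lock);
	assert(t->grp != NULL);

	/* Refuse realtime policy in a group with no runtime assigned. */
	if (policy == POLICY_RT && t->grp->rt_runtime == 0) {
		retval = -EPERM;
		goto unlock;
	}

	t->policy = policy;
unlock:
	pthread_mutex_unlock(&t->lock);
	return retval;
}
```

With a task attached to a group whose rt_runtime is 0,
task_set_policy(&t, POLICY_RT) returns -EPERM and leaves the policy
unchanged; after task_attach() moves it to a group with non-zero
runtime, the same call returns 0.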

