linux-kernel - Re: [PATCH] cpuset: fix cpuset vs PF_THREAD

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <alpine.DEB.2.00.0903161457040.26565@chino.kir.corp.google.com>
Date:	Mon, 16 Mar 2009 15:11:38 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
cc:	Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>, balbir@...ibm.com,
	menage@...gle.com, vatsa@...ux.vnet.ibm.com,
	Bharata B Rao <bharata@...ux.vnet.ibm.com>,
	lkml <linux-kernel@...r.kernel.org>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] cpuset: fix cpuset vs PF_THREAD_BOUND weirdness

On Sat, 14 Mar 2009, Peter Zijlstra wrote:

> David, I'm not sure what you're arguing.
> 
> Letting a kernel thread in a subset, but then not letting it back out
> again seems really weird to me, esp. since PF_THREAD_BOUND is a fairly
> recent thing.
> 

The addition of PF_THREAD_BOUND impacted cpusets because, for the first 
time, it was possible that setting the new set of allowed cpus wouldn't 
succeed (cpusets already deals with the hotplug exception).

That introduced a new requirement: that set_cpus_allowed_ptr() succeeds 
for the attachment of a task to the cpuset changes.  The cpu affinity of 
a task is changed to match that of the cpuset anytime it is moved and then 
imposed a superset for subsequent sched_setaffinity() calls from 
userspace.  Your change would do the latter but neglect the former if it 
doesn't succeed; I'm not immediately opposed to that, but I wasn't 
satisfied with the additional inconsistency that the earlier patch added 
(i.e. changing a cpuset's cpumask that now is disjoint from threads with a 
bound cpu).  So to be clear, I'm not disagreeing with resolving a 
perceived inconsistency with cpuset movement, but I am disagreeing with an 
incomplete solution.

Your solution appears to be more complete, although I have not tested it.  
So thanks for looking at this particular issue.

Let me propose my own patch, however, that would be less intrusive and, if 
anything, more logical.  It would also not require a change to the cgroup 
interface for attachment callbacks.

This patch simply denies PF_THREAD_BOUND tasks from ever moving out of the 
root cpuset.  These kthreads do not require mempolicies that disallow 
access to certain nodes and their cpumask can never be changed; thus, 
cpusets isn't applicable to these threads.  They will forever reside in 
the root cpuset so the interface with userspace remains consistent (i.e. 
`cat tasks' for the root cpuset will still work).

Unless a use-case can be shown for a single PF_THREAD_BOUND task where 
manipulating its cpuset attachment or mempolicy matters, I think this is 
the most appropriate solution.

Signed-off-by: David Rientjes <rientjes@...gle.com>
---
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1360,12 +1360,16 @@ static int cpuset_can_attach(struct cgroup_subsys *ss,
 	if (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed))
 		return -ENOSPC;

-	if (tsk->flags & PF_THREAD_BOUND) {
-		mutex_lock(&callback_mutex);
-		if (!cpumask_equal(&tsk->cpus_allowed, cs->cpus_allowed))
-			ret = -EINVAL;
-		mutex_unlock(&callback_mutex);
-	}
+	/*
+	 * Kthreads bound to specific cpus cannot be moved to a new cpuset; we
+	 * cannot change their cpu affinity and isolating such threads by their
+	 * set of allowed nodes is unnecessary.  Thus, cpusets are not
+	 * applicable for such threads.  This prevents checking for success of
+	 * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
+	 * be changed.
+	 */
+	if (tsk->flags & PF_THREAD_BOUND)
+		return -EINVAL;

 	return ret < 0 ? ret : security_task_setscheduler(tsk, 0, NULL);
 }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/