[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20071015193439.fe67bc4d.pj@sgi.com>
Date: Mon, 15 Oct 2007 19:34:39 -0700
From: Paul Jackson <pj@....com>
To: Paul Menage <menage@...gle.com>
Cc: rientjes@...gle.com, nickpiggin@...oo.com.au,
a.p.zijlstra@...llo.nl, balbir@...ux.vnet.ibm.com,
linux-kernel@...r.kernel.org, clg@...ibm.com,
ebiederm@...ssion.com, containers@...ts.osdl.org, serue@...ibm.com,
svaidy@...ux.vnet.ibm.com, akpm@...ux-foundation.org,
xemul@...nvz.org
Subject: Re: [RFC] cpuset update_cgroup_cpus_allowed
> currently against an older kernel
ah .. which older kernel?
I tried it against the broken out 2.6.23-rc8-mm2 patch set,
inserting it before the task-containersv11-* patches, but
that blew up on me - three rejected hunks.
Any chance of getting this against a current cgroup (aka
container) kernel?
Could you use the diff --show-c-function option when composing
patches - they're easier to read that way - thanks.
+ if (!retval) {
+ cpus_allowed = cpuset_cpus_allowed(p);
+ if (!cpus_subset(new_mask, cpus_allowed)) {
+ /*
+ * We must have raced with a concurrent cpuset
+ * update. Just reset the cpus_allowed to the
+ * cpuset's cpus_allowed
+ */
+ new_mask = cpus_allowed;
This narrows the race, perhaps sufficiently, but I don't see that it
guarantees closure. Memory accesses to two different locations are not
guaranteed to be ordered across nodes, as best I recall. The second
line above, that rereads the cpuset cpus_allowed, could get an old
value, in essence.
cpuset update task sched_setaffinity task
------------------ ----------------------
A. write cpuset [Q] V. read cpuset [Q]
B. read task [P] W. check ok
C. write task [P] X. write task [P]
Y. reread cpuset [Q]
Z. check ok again
Two memory locations:
[P] the cpus_allowed mask in the task_struct of the
task doing the sched_setaffinity call.
[Q] the cpus_allowed mask in the cpuset of the cpuset
to which the sched_setaffinity task is attached.
Even though, from the perspective of location [P], both B. and C.
happened before X., still from the perspective of location [Q] the
rereading in Y. could return the value the cpuset cpus_allowed had
before the write in A. This could result in a task running with
a cpus_allowed that was totally outside its cpusets cpus_allowed.
I will grant that this is a narrow window. I won't loose much sleep
over it.
> - uses a priority heap to pick the processes to act on, based on start time
This adds a fair bit of code and complexity, relative to my patch.
This I do loose more sleep over. There has to be a compelling
reason for doing this.
The point that David raises, regarding the interaction of this with
hotplug, seems to be a compelling reason for doing -something-
different than my patch proposal.
I don't know yet if it compels us to this much code, however.
Any chance you could provide a patch that works against cgroups?
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@....com> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists