Date: Wed, 10 Oct 2007 13:59:44 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Paul Menage <menage@...gle.com>
cc: Paul Jackson <pj@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
nickpiggin@...oo.com.au, a.p.zijlstra@...llo.nl, serue@...ibm.com,
clg@...ibm.com, linux-kernel@...r.kernel.org,
ebiederm@...ssion.com, svaidy@...ux.vnet.ibm.com, xemul@...nvz.org,
containers@...ts.osdl.org, balbir@...ux.vnet.ibm.com
Subject: Re: [PATCH] task containersv11 add tasks file interface fix for
cpusets
On Wed, 10 Oct 2007, Paul Menage wrote:
> On 10/6/07, David Rientjes <rientjes@...gle.com> wrote:
> >
> > It can race with sched_setaffinity(). It has to give up tasklist_lock as
> > well to call set_cpus_allowed() and can race
> >
> > cpus_allowed = cpuset_cpus_allowed(p);
> > cpus_and(new_mask, new_mask, cpus_allowed);
> > retval = set_cpus_allowed(p, new_mask);
> >
> > and allow a task to have a cpu outside of the cpuset's new cpus_allowed if
> > you've taken it away between cpuset_cpus_allowed() and set_cpus_allowed().
>
> cpuset_cpus_allowed() takes callback_mutex, which is held by
> update_cpumask() when it updates cs->cpus_allowed. So if we continue
> to hold callback_mutex across the task update loop this wouldn't be a
> race. Having said that, holding callback mutex for that long might not
> be a good idea.
>

Moving the actual call to set_cpus_allowed() out of the cgroup code and
into sched.c would probably give the cleanest interface.  Then you could
protect the whole thing with sched_hotcpu_mutex, which is expressly
designed for migrations.

Something like this:

	struct cgroup_iter it;
	struct task_struct *p, **tasks;	/* allocation of tasks[] not shown */
	int i = 0;

	/* Take a reference on every task in the cgroup */
	cgroup_iter_start(cs->css.cgroup, &it);
	while ((p = cgroup_iter_next(cs->css.cgroup, &it))) {
		get_task_struct(p);
		tasks[i++] = p;
	}
	cgroup_iter_end(cs->css.cgroup, &it);

	/* Migrate each task outside the iterator, then drop the reference */
	while (--i >= 0) {
		sched_migrate_task(tasks[i], cs->cpus_allowed);
		put_task_struct(tasks[i]);
	}

kernel/sched.c:

	void sched_migrate_task(struct task_struct *task,
				cpumask_t cpus_allowed)
	{
		mutex_lock(&sched_hotcpu_mutex);
		set_cpus_allowed(task, cpus_allowed);
		mutex_unlock(&sched_hotcpu_mutex);
	}

It'd be faster to pass **tasks, along with the number of tasks to
migrate, to a single sched.c function and so reduce the contention on
sched_hotcpu_mutex, but then the put_task_struct() calls are left
dangling over in the cgroup code.  sched_hotcpu_mutex should rarely be
contended anyway.

David
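
The batched alternative mentioned in the last paragraph might look roughly
like the userspace model below. It is only a sketch: every name in it
(model_task, model_hotcpu_mutex, model_migrate_tasks, model_update_group)
is hypothetical, a pthread mutex stands in for sched_hotcpu_mutex, a plain
bitmask stands in for cpumask_t, and none of it is kernel code. It shows
the lock being taken once for the whole array while the reference drops
stay with the caller that took them.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct task_struct. */
struct model_task {
	unsigned long cpus_allowed;	/* bitmask, one bit per CPU */
	int refcount;			/* models get/put_task_struct() */
};

/* Stand-in for sched_hotcpu_mutex, plus a counter of lock round trips. */
static pthread_mutex_t model_hotcpu_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long model_lock_acquisitions;

/* "sched.c" side: one lock acquisition covers the whole batch. */
static void model_migrate_tasks(struct model_task **tasks, size_t nr,
				unsigned long cpus_allowed)
{
	size_t i;

	pthread_mutex_lock(&model_hotcpu_mutex);
	model_lock_acquisitions++;
	for (i = 0; i < nr; i++)
		tasks[i]->cpus_allowed = cpus_allowed;
	pthread_mutex_unlock(&model_hotcpu_mutex);
}

/* "cgroup" side: takes the references and is still the one to drop them. */
static void model_update_group(struct model_task **tasks, size_t nr,
			       unsigned long cpus_allowed)
{
	size_t i;

	for (i = 0; i < nr; i++)
		tasks[i]->refcount++;

	model_migrate_tasks(tasks, nr, cpus_allowed);

	for (i = 0; i < nr; i++)
		tasks[i]->refcount--;
}
```

Calling model_update_group() on an array of two tasks updates both masks
with a single lock acquisition, at the cost of splitting the get/put
pairing across the two layers, which is the trade-off described above.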