[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.1.10.0806091447300.10680@chino.kir.corp.google.com>
Date: Mon, 9 Jun 2008 15:07:43 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Max Krasnyanskiy <maxk@...lcomm.com>
cc: Paul Jackson <pj@....com>, mingo@...e.hu, peterz@...radead.org,
menage@...gle.com, linux-kernel@...r.kernel.org,
Oleg Nesterov <oleg@...sign.ru>
Subject: Re: [patch] sched: prevent bound kthreads from changing
cpus_allowed
On Mon, 9 Jun 2008, Max Krasnyanskiy wrote:
> Actually I have another use case here. Above example in particular may be ok
> but it does demonstrate the issue nicely. Which is that in some cases kthreads
> are bound to a CPU but do not have a strict "must run here" requirement and
> could be moved if needed.
That isn't a completely accurate description of the patch; the kthread
itself is always allowed to change its cpu affinity. This exception, for
example, allows stopmachine to still work correctly since it uses
set_cpus_allowed_ptr() for each thread.
This patch simply prohibits other tasks from changing the cpu affinity of
a kthread that has called kthread_bind().
> For example I need an ability to move workqueue threads.
Let's elaborate a little on that: you're moving workqueue threads from
their originating cpu to another cpu (hopefully on the same NUMA node)
using cpusets or sched_setaffinity()?
> Workqueue threads do
> kthread_bind().
> So how about we add something like kthread_bind_strict() which would set
> PF_THREAD_BOUND ?
> We could also simply add flags argument to the kthread_bind() which would be
> better imo but requires more changes. ie It'd look like
> kthread_bind(..., cpu, KTHREAD_BIND_STRICT);
>
> Things like migration threads, stop machine, etc would use the strict version
> and everything else would use non-strict bind.
>
It depends on the number of exceptions that we want to allow. If there's
one or two, it's sufficient to just use
p->flags &= ~PF_THREAD_BOUND;
upon return from kthread_bind(), or simply convert the existing code to
use set_cpus_allowed_ptr() instead:
kthread_bind(p, cpu);
becomes
cpumask_t mask = cpumask_of_cpu(cpu);
set_cpus_allowed_ptr(p, &mask);
Or, if we have more exceptions to the rule than actual strict binding for
kthreads using kthread_bind(), we can just add
p->flags |= PF_THREAD_BOUND;
upon return on watchdog and migration threads.
But, to me, the semantics of kthread_bind() are pretty clear. I think
it's dangerous to move kthreads such as watchdog or migration threads out
from under them and that is easily shown if you try to do it. There are
perhaps exceptions to the rule where existing kthread_bind() calls could
be converted to set_cpus_allowed_ptr(), but I think we should enumerate
those cases and discuss the hazards that could be associated with changing
their cpu affinity.
I'd also like to hear why you choose to move your workqueue threads off
their originating cpu.
> On the related note (this seems like the right crowd :). What do people think
> about kthreads and cpusets in general. We currently have a bit of a disconnect
> in the logic.
> 1. kthreads can be put into a cpuset at which point their cpus_allowed mask is
> updated properly
> 2. kthread's cpus_allowed mask is updated properly when cpuset setup changes
> (cpus added, removed, etc).
> 3. kthreads inherit cpuset from a parent (kthreadd for example) _but_ they
> either do kthread_bind() or set_cpus_allowed() and both of those simply ignore
> inherited cpusets.
>
With my patch, this is slightly different. Kthreads that have called
kthread_bind() can have a different cpu affinity than the cpus_allowed of
their cpuset. This happens when such a kthread is attached to a cpuset
and its 'cpus' file subsequently changes. Cpusets is written correctly to
use set_cpus_allowed_ptr() so that this change in affinity is now rejected
for PF_THREAD_BOUND tasks, yet the kthread is still a member of the
cpuset.
It's debatable whether the kthread should still remain as a member of the
cpuset, but there are significant advantages: cpusets offers more than
just a simple way to do sched_setaffinity() over an aggregate of tasks.
David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists