[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.58.0801291712450.20371@gandalf.stny.rr.com>
Date: Tue, 29 Jan 2008 17:23:28 -0500 (EST)
From: Steven Rostedt <rostedt@...dmis.org>
To: Gregory Haskins <ghaskins@...ell.com>
cc: Paul Jackson <pj@....com>, a.p.zijlstra@...llo.nl, mingo@...e.hu,
dmitry.adamushko@...il.com, menage@...gle.com, rientjes@...gle.com,
tong.n.li@...el.com, tglx@...utronix.de, akpm@...ux-foundation.org,
dhaval@...ux.vnet.ibm.com, vatsa@...ux.vnet.ibm.com,
sgrubb@...hat.com, linux-kernel@...r.kernel.org,
ebiederm@...ssion.com, nickpiggin@...oo.com.au
Subject: Re: scheduler scalability - cgroups, cpusets and load-balancing
On Tue, 29 Jan 2008, Gregory Haskins wrote:
> >
> > I would like to get our concepts clear, and terms consistent. That's
> > important for those others who would try to understand this.
>
> Very good idea. Thanks for doing this!
>
Sorry for coming in so late, I've been banging my head on different bugs
all day.
Just to clear up what our goal for the RT balancer was, and how simple it
is ;-) Basically any task that has an RT priority needs to run ASAP from
the time it woke up if there's a CPU available that it can run on and it
has a higher priority than what is currently running on that CPU.
If an RT task wakes up, and there's a CPU available somewhere for it to
run on, we want the RT task to jump to that CPU and run. RT tasks should
not be waiting around for nice load balancing that optimizes the cache
usage. But we also have a problem as well. We don't want to kill the
cache on large NUMA architectures by looking for places for on RT task to
run. With domains, we first look for a place that an RT task can run on a
local CPU in the node, if so, then place it there, otherwise look at other
nodes to run on.
Note, that the RT balancing is aggressive and not passive. That means that
the balancing takes place at the time the RT task is awoken (perhaps by
the task that is waking it) or at the time a task changes priority. It is
not passive, being that it waits for something else to migrate it
(i.e. migration thread)
Paul, I think you now understand that we don't have some scheduler domain
that is specific for RT. The scheduling class is specific to the priority
of the process and not to what domain it is in. But if you keep a domain
invisible to an RT task, that domain never needs to worry about RT tasks
migrating to it.
The code that Gregory and I have been adding was to try to migrate RT
tasks to CPUs they can run on as quick as possible without algorithms that
cause cacheline bouncing.
-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists