Message-Id: <20071003025518.7c184905.pj@sgi.com>
Date: Wed, 3 Oct 2007 02:55:18 -0700
From: Paul Jackson <pj@....com>
To: Nick Piggin <nickpiggin@...oo.com.au>
Cc: akpm@...ux-foundation.org, menage@...gle.com,
linux-kernel@...r.kernel.org, dino@...ibm.com, cpw@....com,
mingo@...e.hu
Subject: Re: [PATCH] cpuset and sched domains: sched_load_balance flag
> > Yeah -- cpusets are hierarchical. And some of the use cases for
> > which cpusets are designed are hierarchical.
>
> But partitioning isn't.
Yup. We've got a square peg and a round hole. An impedance mismatch.
That's the root cause of this entire wibbling session, in my view.
The essential role of cpusets, cgroups and much other such recent
work, in my view, is pounding this square peg into that round hole.
In essence, it is fitting the hierarchical structure of the
organizations (corporations, universities and governments) who own big
systems to the flat, system-wide mandates needed to manage a given
computer system.
> > Changing cpusets from single root to multiple roots would be
> > bastardizing it.
>
> Well OK, if that's your definition. Not very helpful though.
Well, such a change would be rather substantial and undesired,
if those terms help you more.
> > To repeat myself, in some cases, such as batch schedulers running in a
> > subset of the CPUs on a large system, the code that knows some of the
> > needs for load balancing does not have system wide control to mandate
> > hard partitioning. The batch scheduler can state where it is depending
> > on load balancing being present, and the system administrator can choose
> > or not to turn off load balancing in the top cpuset, thereby granting or
> > not control over load balancing on the CPUs controlled by the batch
> > scheduler to the batch scheduler.
>
> Why isn't that possible with my approach?
If I understand your approach to the kernel-to-user interface correctly
(sometimes I doubt I do) then your approach expects some user space code
or person or semi-intelligent equivalent to define a flat partition,
which will then be used to determine the sched domains.
In the batch scheduler case, running on a large shared system used
perhaps by several departments, no one entity can do that. One person,
perhaps the system admin, knows if they want to give complete control
of some big chunk of CPUs to a batch scheduler. The batch scheduler,
written by someone else far away and long ago, knows which jobs are
actively running on which subsets of the CPUs the batch scheduler is
using.
There is no single monolithic entity on such systems who knows all and
can dictate all details of a single, flat, system-wide partitioning.
The partitioning has to be synthesized from the combined requests of
several user space entities. That's ok -- this is bread and butter
work for cpusets.
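
To make that synthesis concrete, here is a rough sketch in Python. It is
illustrative only and is not the kernel code: the function name
sched_domains, the (cpus, load_balance) pair layout and the example
cpuset names are all assumptions made for this sketch. Each entity just
sets a load balance flag on the cpusets it controls, and the flat
partition falls out by merging every flagged cpuset that overlaps
another.

```python
def sched_domains(cpusets):
    """Derive a flat partition from per-cpuset load balance requests.

    cpusets: iterable of (cpus, load_balance) pairs, one per cpuset in
    the hierarchy (hypothetical layout, assumed for this sketch).
    Returns a sorted list of sorted CPU lists, one per sched domain.
    CPUs covered by no flagged cpuset end up in no domain.
    """
    domains = []  # pairwise disjoint sets of CPUs
    for cpus, load_balance in cpusets:
        if not load_balance:
            continue  # this cpuset asks for no balancing of its own
        merged = set(cpus)
        rest = []
        for d in domains:
            if d & merged:
                merged |= d   # overlapping requests share one domain
            else:
                rest.append(d)
        domains = rest + [merged]
    return sorted(sorted(d) for d in domains)


# Hypothetical 8-CPU system. The admin turns balancing off in the top
# cpuset and hands CPUs 0-5 to a batch scheduler, which in turn flags
# only the cpusets of its active jobs.
example = [
    (range(0, 8), False),  # /             (set by the system admin)
    (range(0, 6), False),  # /batch        (set by the batch scheduler)
    (range(0, 4), True),   # /batch/job1
    (range(4, 6), True),   # /batch/job2
    (range(6, 8), True),   # /interactive  (set by the system admin)
]
print(sched_domains(example))
```

If the admin instead leaves the flag on in the top cpuset, everything
merges into one system-wide domain, so the batch scheduler's requests
change nothing. That is the "granting or not" of control described in
the quoted text above.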
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@....com> 1.925.600.0401