lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20080129130403.92d0a1fe.pj@sgi.com>
Date:	Tue, 29 Jan 2008 13:04:03 -0600
From:	Paul Jackson <pj@....com>
To:	"Gregory Haskins" <ghaskins@...ell.com>
Cc:	a.p.zijlstra@...llo.nl, mingo@...e.hu, dmitry.adamushko@...il.com,
	rostedt@...dmis.org, menage@...gle.com, rientjes@...gle.com,
	tong.n.li@...el.com, tglx@...utronix.de, akpm@...ux-foundation.org,
	dhaval@...ux.vnet.ibm.com, vatsa@...ux.vnet.ibm.com,
	sgrubb@...hat.com, linux-kernel@...r.kernel.org,
	ebiederm@...ssion.com, nickpiggin@...oo.com.au
Subject: Re: scheduler scalability - cgroups, cpusets and load-balancing

Gregory wrote:
> IMHO it works well the way it is:  The user selects the class for a
> particular task using sched_setscheduler(), and they select the cpuset
> (or inherit it) that defines its execution scope.  If that scope has
> balancing enabled, the policy for the member classes is in effect.

Ok.

For the various classes of schedulers (sched_class's), it's fine by me
if sched domains are polymorphic, supporting all classes, and it is
left to each task to self-select the scheduling class of its preference.

For the batch scheduler case, this -must- be imposable from outside
the task, by the batch scheduler that is overseeing the job, and it
must support the batch scheduler being able to disable all the
balancers in selected cpusets (selected sched_domains).

We have that now.  Each of us only knew of part of the solution,
but we managed to arrive at the desired answer even so ... amazing.

The batch scheduler just has to arrange to get 'sched_load_balance'
turned off in a cpuset and all overlapping cpusets, and then the
CPUS in that cpuset will not belong to -any- sched_domain, and hence
(could you verify I'm right in this detail?) won't be balanced by any
sched_class.

I should update the documentation for sched_load_balance, changing it
from saying that you get realtime by turning off sched_load_balance in
the RT cpuset, to saying that you get realtime by (1) turning off
sched_load_balance in any overlapping cpusets, including all
encompassing parent cpusets, (2) leaving sched_load_balance on in the
RT cpuset itself, and (3) having those realtime tasks each self-select
(elect) the desired SCHED_* using sched_setscheduler().

Condition (1) above is a tad difficult to understand, but servicable,
I guess.  The combination of (1) and (2) results in a separate
sched_domain just for the CPUs in the RT cpuset.

> (on this topic, note that I do not know if the RT-balancer will
> respect the cpuset concept of "balance-enabled" anyway.  That might
> have to be fixed)

Er eh ... it has no choice.  If the user space code has configured a
cpuset with 'sched_load_balance' turned off in that cpuset and all
overlapping cpusets, then there will not even be a sched_domain
covering those CPUs, and hence no balancer, RT or other class, will
even see those CPUs.

Unless I really don't understand the kernel/sched.c sched_domain code
(a distinct possibility), if some CPU is not in any sched_domain, then
it won't get balanced, RT or otherwise.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@....com> 1.940.382.4214
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ