Date:	Tue, 29 Jan 2008 13:36:50 -0700
From:	"Gregory Haskins" <ghaskins@...ell.com>
To:	"Paul Jackson" <pj@....com>
Cc:	<a.p.zijlstra@...llo.nl>, <mingo@...e.hu>,
	<dmitry.adamushko@...il.com>, <rostedt@...dmis.org>,
	<menage@...gle.com>, <rientjes@...gle.com>, <tong.n.li@...el.com>,
	<tglx@...utronix.de>, <akpm@...ux-foundation.org>,
	<dhaval@...ux.vnet.ibm.com>, <vatsa@...ux.vnet.ibm.com>,
	<sgrubb@...hat.com>, <linux-kernel@...r.kernel.org>,
	<ebiederm@...ssion.com>, <nickpiggin@...oo.com.au>
Subject: Re: scheduler scalability - cgroups, cpusets and
	load-balancing

>>> On Tue, Jan 29, 2008 at  2:04 PM, in message
<20080129130403.92d0a1fe.pj@....com>, Paul Jackson <pj@....com> wrote: 
> Gregory wrote:
>> IMHO it works well the way it is:  The user selects the class for a
>> particular task using sched_setscheduler(), and they select the cpuset
>> (or inherit it) that defines its execution scope.  If that scope has
>> balancing enabled, the policy for the member classes is in effect.
> 
> Ok.
> 
> For the various classes of schedulers (sched_class's), it's fine by me
> if sched domains are polymorphic, supporting all classes, and it is
> left to each task to self-select the scheduling class of its preference.
> 
> For the batch scheduler case, this -must- be imposable from outside
> the task, by the batch scheduler that is overseeing the job, and it
> must support the batch scheduler being able to disable all the
> balancers in selected cpusets (selected sched_domains).
> 
> We have that now.  Each of us only knew of part of the solution,
> but we managed to arrive at the desired answer even so ... amazing.
> 
> The batch scheduler just has to arrange to get 'sched_load_balance'
> turned off in a cpuset and all overlapping cpusets, and then the
> CPUS in that cpuset will not belong to -any- sched_domain, and hence
> (could you verify I'm right in this detail?) won't be balanced by any
> sched_class.

I am a little fuzzy on how this would work, so I can't say for certain.  :) But that does seem accurate.
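
For illustration, the step Paul describes (turning 'sched_load_balance' off in a cpuset and in every overlapping cpuset) might look roughly like this from a userspace batch scheduler. This is a minimal sketch, not anything from the thread: the helper names are hypothetical, and it assumes the cpuset filesystem is mounted somewhere with one directory per cpuset, each holding a 'sched_load_balance' control file.

```python
import os


def set_load_balance(cpuset_dir, enabled):
    """Write 1 or 0 to the sched_load_balance file of one cpuset directory.

    cpuset_dir is assumed to be a directory in a mounted cpuset filesystem.
    Returns the path that was written.
    """
    path = os.path.join(cpuset_dir, "sched_load_balance")
    with open(path, "w") as f:
        f.write("1" if enabled else "0")
    return path


def isolate_from_balancing(cpuset_dirs):
    """Turn balancing off in a cpuset and every cpuset overlapping it.

    The caller supplies the full list, including encompassing parents;
    this sketch does not discover overlaps on its own.
    """
    for cpuset_dir in cpuset_dirs:
        set_load_balance(cpuset_dir, False)
```

A call such as isolate_from_balancing(["/dev/cpuset", "/dev/cpuset/batch"]) would cover a child cpuset and its parent; both paths are assumptions about the mount point and cpuset name.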


> 
> I should update the documentation for sched_load_balance, changing it
> from saying that you get realtime by turning off sched_load_balance in
> the RT cpuset, to saying that you get realtime by (1) turning off
> sched_load_balance in any overlapping cpusets, including all
> encompassing parent cpusets, (2) leaving sched_load_balance on in the
> RT cpuset itself, and (3) having those realtime tasks each self-select
> (elect) the desired SCHED_* using sched_setscheduler().
> 
> Condition (1) above is a tad difficult to understand, but serviceable,
> I guess.  The combination of (1) and (2) results in a separate
> sched_domain just for the CPUs in the RT cpuset.

Technically you only need (2).  I normally run my 4-8 core development systems in the single default global cpuset.  Customers typically do use multiple cpusets, but we only use the vanilla balanced variety.
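
As a rough sketch of step (3) in Paul's list, a task electing its own class with sched_setscheduler() could look like the following. The function name and the default priority are hypothetical; the call normally needs realtime privileges (for example CAP_SYS_NICE), so the sketch reports failure instead of raising.

```python
import os


def elect_realtime(priority=10, policy=os.SCHED_FIFO):
    """Ask the kernel to run the calling task under a realtime policy.

    The priority is clamped to the range the kernel reports for the policy.
    Returns True on success, False if the kernel refuses the request.
    """
    lo = os.sched_get_priority_min(policy)
    hi = os.sched_get_priority_max(policy)
    prio = max(lo, min(hi, priority))
    try:
        # pid 0 means "the calling task".
        os.sched_setscheduler(0, policy, os.sched_param(prio))
    except OSError:
        return False
    return True
```

With balancing left on in the cpuset the task runs in, the policy for the elected class then applies within that scope.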

> 
>> (on this topic, note that I do not know if the RT-balancer will
>> respect the cpuset concept of "balance-enabled" anyway.  That might
>> have to be fixed)
> 
> Er eh ... it has no choice.  If the user space code has configured a
> cpuset with 'sched_load_balance' turned off in that cpuset and all
> overlapping cpusets, then there will not even be a sched_domain
> covering those CPUs, and hence no balancer, RT or other class, will
> even see those CPUs.
> 
> Unless I really don't understand the kernel/sched.c sched_domain code
> (a distinct possibility), if some CPU is not in any sched_domain, then
> it won't get balanced, RT or otherwise.

Heh...I can't quite wrap my head around that, but it sounds like you are correct.  The only thing I was really pointing out is that the RT code doesn't necessarily look at sched-domain flags before making balancing decisions.  So as long as that is not a requirement, I think we are all set.
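
One way to picture Paul's argument is a toy model of how balanced cpusets map onto sched domains. This is only a sketch of the idea, not the kernel/sched.c code: it assumes overlapping cpusets with 'sched_load_balance' on are merged into one domain, and that CPUs covered by no balanced cpuset end up in no domain at all.

```python
def partition_sched_domains(cpusets):
    """Toy model: cpusets is a list of (set_of_cpu_ids, balance_enabled).

    Returns a list of disjoint CPU sets, one per modelled sched domain.
    Overlapping balanced cpusets are merged; unbalanced ones contribute nothing.
    """
    domains = []
    for cpus, balanced in cpusets:
        if not balanced or not cpus:
            continue
        merged = set(cpus)
        remaining = []
        for dom in domains:
            if dom & merged:
                merged |= dom  # overlap: fold the existing domain in
            else:
                remaining.append(dom)
        remaining.append(merged)
        domains = remaining
    return domains


def unbalanced_cpus(all_cpus, cpusets):
    """CPUs that belong to no modelled sched domain."""
    covered = set()
    for dom in partition_sched_domains(cpusets):
        covered |= dom
    return set(all_cpus) - covered
```

In this model, a top cpuset over CPUs 0-3 with balancing off plus an RT cpuset over CPUs 2-3 with balancing on yields a single domain {2, 3}, and CPUs 0 and 1 are seen by no balancer of any class.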



