linux-kernel - Re: scheduler scalability - cgroups, cpusets and load-balancing

Message-Id: <20080129053005.bc7a11d7.pj@sgi.com>
Date:	Tue, 29 Jan 2008 05:30:05 -0600
From:	Paul Jackson <pj@....com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu,
	vatsa@...ux.vnet.ibm.com, dhaval@...ux.vnet.ibm.com,
	nickpiggin@...oo.com.au, ebiederm@...ssion.com,
	akpm@...ux-foundation.org, sgrubb@...hat.com, rostedt@...dmis.org,
	ghaskins@...ell.com, dmitry.adamushko@...il.com,
	tong.n.li@...el.com, tglx@...utronix.de, menage@...gle.com,
	rientjes@...gle.com
Subject: Re: scheduler scalability - cgroups, cpusets and load-balancing

Peter wrote, in reply to Peter ;):
> > [ It looks to me it balances a group over the largest SD the current cpu
> >   has access to, even though that might be larger than the SD associated
> >   with the cpuset of that particular cgroup. ]
> 
> Hmm, with a bit more thought I think that does indeed DTRT. Because, if
> the cpu belongs to a disjoint cpuset, the highest sd (with
> load-balancing enabled) would be that. Right?

The code that defines sched domains, kernel/sched.c partition_sched_domains(),
as called from the cpuset code in kernel/cpuset.c rebuild_sched_domains(),
does not make use of the full range of sched_domain possibilities.

In particular, it only sets up some non-overlapping set of sched domains.
Every CPU ends up in at most a single sched domain.
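
A minimal sketch of what "non-overlapping" means here, written in Python
rather than kernel C. It is a simplified model, not the code in
kernel/cpuset.c: the function name partition_domains and its input format
are assumptions made for illustration. It takes the CPU sets of the
cpusets that have sched_load_balance enabled and merges any that share a
CPU, so the result is pairwise disjoint and a CPU in no balanced cpuset
ends up in no domain at all.

```python
def partition_domains(balanced_cpusets):
    """Merge overlapping CPU sets into pairwise-disjoint groups.

    balanced_cpusets: iterable of sets of CPU numbers, one per cpuset
    that has sched_load_balance enabled (hypothetical input format).
    Returns a list of disjoint sets, ordered by lowest CPU number.
    """
    domains = []  # invariant: the sets in here are pairwise disjoint
    for cpus in balanced_cpusets:
        merged = set(cpus)
        if not merged:
            continue  # a cpuset with no CPUs contributes nothing
        remaining = []
        for dom in domains:
            if dom & merged:
                merged |= dom          # shares a CPU: absorb it
            else:
                remaining.append(dom)  # disjoint: keep it as is
        remaining.append(merged)
        domains = remaining
    return sorted(domains, key=min)


if __name__ == "__main__":
    # CPUs 0-2 collapse into one group because the first two sets share CPU 1.
    print(partition_domains([{0, 1}, {1, 2}, {4, 5}]))
```

In this model, two cpusets that overlap can never yield two distinct
domains; they always collapse into one, which is the limitation
described above.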

The original reason that one can't define overlapping sched domains via
this cpuset interface (based off the cpuset 'sched_load_balance' flag)
is that I didn't realize it was even possible to overlap sched domains
when I wrote the cpuset code defining sched domains.  And then when I
later realized one could overlap sched domains, I (a) didn't see a need
to do so, and (b) couldn't see how to do so via the cpuset interface
without causing my brain to explode.

Now, back to Peter's question.  Being a bit pedantic, CPUs don't belong
to disjoint cpusets, except in the minimal case where there is only one
cpuset covering all CPUs.

Rather, what happens when you need some realtime CPUs is that:
 1) you turn off sched_load_balance on the top cpuset,
 2) you set up your realtime cpuset as a child cpuset of the top cpuset
    such that its CPUs don't overlap those of any of its siblings, and
 3) you turn off sched_load_balance in that realtime cpuset.

At that point, sched domains are rebuilt, including providing a
sched domain that just contains the CPUs in that realtime cpuset, and
normal scheduler load balancing ceases on the CPUs in that realtime
cpuset.
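
The three steps above can be sketched as writes to the cpuset
filesystem. This is a sketch under stated assumptions, not a tested
recipe: the function isolate_realtime_cpus and its parameters are
hypothetical, the file names cpus, mems and sched_load_balance assume
the legacy "mount -t cpuset" layout (commonly mounted at /dev/cpuset),
and checking that the chosen CPUs don't overlap any sibling cpuset is
left to the caller. The demo at the bottom runs against a temporary
directory, so it only shows which files get which values.

```python
import os
import tempfile


def isolate_realtime_cpus(root, name, cpus, mems):
    """Apply the three steps under the cpuset mount point `root`.

    root: cpuset mount point (assumption: legacy layout, e.g. /dev/cpuset)
    name: name of the child cpuset to create (hypothetical, e.g. "rt")
    cpus, mems: list strings such as "2-3" and "0"
    Returns the path of the realtime cpuset directory.
    """
    # Step 1: turn off sched_load_balance on the top cpuset.
    with open(os.path.join(root, "sched_load_balance"), "w") as f:
        f.write("0")

    # Step 2: set up the realtime cpuset as a child of the top cpuset.
    # The caller must pick CPUs that no sibling cpuset uses.
    rt_dir = os.path.join(root, name)
    os.makedirs(rt_dir, exist_ok=True)
    for fname, value in (("cpus", cpus), ("mems", mems)):
        with open(os.path.join(rt_dir, fname), "w") as f:
            f.write(value)

    # Step 3: turn off sched_load_balance in the realtime cpuset.
    with open(os.path.join(rt_dir, "sched_load_balance"), "w") as f:
        f.write("0")
    return rt_dir


if __name__ == "__main__":
    # Dry run in a scratch directory; a real run would pass the actual
    # cpuset mount point and need root privileges.
    with tempfile.TemporaryDirectory() as scratch:
        rt = isolate_realtime_cpus(scratch, "rt", "2-3", "0")
        for fname in sorted(os.listdir(rt)):
            with open(os.path.join(rt, fname)) as f:
                print(fname, "=", f.read())
```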

> [ Just a bit of a shame we have all cgroups represented on each cpu. ]

Could you restate this -- I suspect it's obvious, but I'm oblivious ;).

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@....com> 1.940.382.4214