lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <52CBCB85.8050607@linux.vnet.ibm.com>
Date:	Tue, 07 Jan 2014 15:10:21 +0530
From:	Preeti U Murthy <preeti@...ux.vnet.ibm.com>
To:	Vincent Guittot <vincent.guittot@...aro.org>, peterz@...radead.org
CC:	linux-kernel@...r.kernel.org, mingo@...nel.org, pjt@...gle.com,
	Morten.Rasmussen@....com, cmetcalf@...era.com, tony.luck@...el.com,
	alex.shi@...aro.org, linaro-kernel@...ts.linaro.org, rjw@...k.pl,
	paulmck@...ux.vnet.ibm.com, corbet@....net, tglx@...utronix.de,
	len.brown@...el.com, arjan@...ux.intel.com,
	amit.kucheria@...aro.org, james.hogan@...tec.com,
	schwidefsky@...ibm.com, heiko.carstens@...ibm.com,
	Dietmar.Eggemann@....com
Subject: Re: [RFC] sched: CPU topology try

Hi Vincent, Peter,

On 12/18/2013 06:43 PM, Vincent Guittot wrote:
> This patch applies on top of the two patches [1][2] that have been proposed by
> Peter for creating a new way to initialize sched_domain. It includes some minor
> compilation fixes and a trial of using this new method on ARM platform.
> [1] https://lkml.org/lkml/2013/11/5/239
> [2] https://lkml.org/lkml/2013/11/5/449
> 
> Based on the results of this tests, my feeling about this new way to init the
> sched_domain is a bit mitigated.
> 
> The good point is that I have been able to create the same sched_domain
> topologies than before and even more complex ones (where a subset of the cores
> in a cluster share their powergating capabilities). I have described various
> topology results below.
> 
> I use a system that is made of a dual cluster of quad cores with hyperthreading
> for my examples.
> 
> If one cluster (0-7) can powergate its cores independantly but not the other
> cluster (8-15) we have the following topology, which is equal to what I had
> previously:
> 
> CPU0:
> domain 0: span 0-1 level: SMT
>     flags: SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN
>     groups: 0 1
>   domain 1: span 0-7 level: MC
>       flags: SD_SHARE_PKG_RESOURCES
>       groups: 0-1 2-3 4-5 6-7
>     domain 2: span 0-15 level: CPU
>         flags:
>         groups: 0-7 8-15
> 
> CPU8
> domain 0: span 8-9 level: SMT
>     flags: SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN
>     groups: 8 9
>   domain 1: span 8-15 level: MC
>       flags: SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN
>       groups: 8-9 10-11 12-13 14-15
>     domain 2: span 0-15 level CPU
>         flags:
>         groups: 8-15 0-7
> 
> We can even describe some more complex topologies if a susbset (2-7) of the
> cluster can't powergate independatly:
> 
> CPU0:
> domain 0: span 0-1 level: SMT
>     flags: SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN
>     groups: 0 1
>   domain 1: span 0-7 level: MC
>       flags: SD_SHARE_PKG_RESOURCES
>       groups: 0-1 2-7
>     domain 2: span 0-15 level: CPU
>         flags:
>         groups: 0-7 8-15
> 
> CPU2:
> domain 0: span 2-3 level: SMT
>     flags: SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN
>     groups: 0 1
>   domain 1: span 2-7 level: MC
>       flags: SD_SHARE_PKG_RESOURCES | SD_SHARE_POWERDOMAIN
>       groups: 2-7 4-5 6-7
>     domain 2: span 0-7 level: MC
>         flags: SD_SHARE_PKG_RESOURCES
>         groups: 2-7 0-1
>       domain 3: span 0-15 level: CPU
>           flags:
>           groups: 0-7 8-15
> 
> In this case, we have an aditionnal sched_domain MC level for this subset (2-7)
> of cores so we can trigger some load balance in this subset before doing that
> on the complete cluster (which is the last level of cache in my example)
> 
> We can add more levels that will describe other dependency/independency like
> the frequency scaling dependency and as a result the final sched_domain
> topology will have additional levels (if they have not been removed during
> the degenerate sequence)
> 
> My concern is about the configuration of the table that is used to create the
> sched_domain. Some levels are "duplicated" with different flags configuration
> which make the table not easily readable and we must also take care of the
> order  because parents have to gather all cpus of its childs. So we must
> choose which capabilities will be a subset of the other one. The order is
> almost straight forward when we describe 1 or 2 kind of capabilities
> (package ressource sharing and power sharing) but it can become complex if we
> want to add more.

What if we want to add arch specific flags to the NUMA domain? Currently
with Peter's patch:https://lkml.org/lkml/2013/11/5/239 and this patch,
the arch can modify the sd flags of the topology levels till just before
the NUMA domain. In sd_init_numa(), the flags for the NUMA domain get
initialized. We need to perhaps call into arch here to probe for
additional flags?

Thanks

Regards
Preeti U Murthy
> 
> Regards
> Vincent
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ