Date:	Wed, 11 May 2016 20:24:02 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Matt Fleming <matt@...eblueprint.co.uk>
Cc:	mingo@...nel.org, linux-kernel@...r.kernel.org, clm@...com,
	mgalbraith@...e.de, tglx@...utronix.de, fweisbec@...il.com,
	srikar@...ux.vnet.ibm.com, mikey@...ling.org, anton@...ba.org
Subject: Re: [RFC][PATCH 4/7] sched: Replace sd_busy/nr_busy_cpus with
 sched_domain_shared

On Wed, May 11, 2016 at 02:33:45PM +0200, Peter Zijlstra wrote:
> Hmm, PPC folks; what does your topology look like?
> 
> Currently your sched_domain_topology, as per arch/powerpc/kernel/smp.c
> seems to suggest your cores do not share cache at all.
> 
> https://en.wikipedia.org/wiki/POWER7 seems to agree and states
> 
>   "4 MB L3 cache per C1 core"
> 
> And http://www-03.ibm.com/systems/resources/systems_power_software_i_perfmgmt_underthehood.pdf
> also explicitly draws pictures with the L3 per core.
> 
> _however_, that same document describes L3 inter-core fill and lateral
> cast-out, which sounds like the L3s work together to form a node wide
> caching system.
> 
> Do we want to model this co-operative L3 slices thing as a sort of
> node-wide LLC for the purpose of the scheduler ?

Going back a generation; Power6 seems to have a shared L3 (off package)
between the two cores on the package. The current topology does not
reflect that at all.

And going forward a generation; Power8 seems to share the per-core
(chiplet) L3 amongst all cores (chiplets) + it has the Centaur (memory
controller) 16M L4.

So it seems the current topology setup is not describing these chips
very well. Also note that the arch topology code can runtime select a
topology, so you could make that topo setup micro-arch specific.
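
To make the "micro-arch specific" idea concrete, here is a rough userspace
sketch, not kernel code. Everything in it is a made-up stand-in: the
struct, the enum, the level names and which levels are marked as sharing
cache are assumptions based only on the descriptions above. In the kernel
the equivalent would be a per-micro-arch sched_domain_topology_level table
handed to set_sched_topology(), with SD_SHARE_PKG_RESOURCES set on the
levels that share cache.

```c
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's sched_domain_topology_level. */
enum uarch { UARCH_POWER6, UARCH_POWER7, UARCH_POWER8 };

struct topo_level {
	const char *name;	/* level name, e.g. "SMT" */
	int shares_cache;	/* nonzero if CPUs at this level share a cache */
};

/* What the current setup seems to describe: only SMT siblings share cache. */
static const struct topo_level power7_topology[] = {
	{ "SMT", 1 },
	{ "DIE", 0 },
	{ NULL, 0 },
};

/* Assumption: Power6 cores on a package share the off-package L3. */
static const struct topo_level power6_topology[] = {
	{ "SMT", 1 },
	{ "MC",  1 },	/* hypothetical level for the shared L3 */
	{ "DIE", 0 },
	{ NULL, 0 },
};

/* Assumption: Power8 per-core L3 slices act as one cache across cores. */
static const struct topo_level power8_topology[] = {
	{ "SMT", 1 },
	{ "MC",  1 },	/* hypothetical level for the co-operative L3 */
	{ "DIE", 0 },
	{ NULL, 0 },
};

/* Pick a table at runtime based on the detected micro-architecture. */
static const struct topo_level *select_topology(enum uarch u)
{
	switch (u) {
	case UARCH_POWER6:
		return power6_topology;
	case UARCH_POWER8:
		return power8_topology;
	default:
		return power7_topology;
	}
}

/* Name of the widest level that still shares cache, i.e. the LLC level. */
static const char *llc_level(const struct topo_level *t)
{
	const char *llc = NULL;

	for (; t->name; t++)
		if (t->shares_cache)
			llc = t->name;
	return llc;
}
```

With tables like these, llc_level(select_topology(UARCH_POWER8)) would name
the multi-core level rather than SMT, which is the kind of LLC domain the
scheduler could then use. Whether Power7 should get the same treatment is
the open question from the quoted mail.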
