Message-ID: <1308227242.13240.56.camel@twins>
Date:	Thu, 16 Jun 2011 14:27:22 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Samuel Thibault <samuel.thibault@...-lyon.org>
Cc:	mingo@...e.hu, linux-kernel@...r.kernel.org,
	Suresh Siddha <suresh.b.siddha@...el.com>,
	Venkatesh Pallipadi <venki@...gle.com>,
	Srivatsa Vaddagiri <vatsa@...ibm.com>,
	Paul Turner <pjt@...gle.com>, Mike Galbraith <efault@....de>,
	Andreas Herrmann <andreas.herrmann3@....com>,
	Heiko Carstens <heiko.carstens@...ibm.com>
Subject: Re: "Cache" sched domains

On Thu, 2011-06-16 at 14:11 +0200, Samuel Thibault wrote:
> Hello,
> 
> We have an x86 machine whose sockets look like this in hwloc:
> 
> ┌──────────────────────────────────────────────────────────────────┐
> │Socket P#1                                                        │
> │┌────────────────────────────────────────────────────────────────┐│
> ││L3 (16MB)                                                       ││
> │└────────────────────────────────────────────────────────────────┘│
> │┌────────────────────┐┌────────────────────┐┌────────────────────┐│
> ││L2 (3072KB)         ││L2 (3072KB)         ││L2 (3072KB)         ││
> │└────────────────────┘└────────────────────┘└────────────────────┘│
> │┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
> ││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││L1 (32KB)││
> │└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
> │┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐┌─────────┐│
> ││Core P#0 ││Core P#1 ││Core P#2 ││Core P#3 ││Core P#4 ││Core P#5 ││
> ││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││┌───────┐││
> │││PU P#0 ││││PU P#4 ││││PU P#8 ││││PU P#12││││PU P#16││││PU P#20│││
> ││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││└───────┘││
> │└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘└─────────┘│
> └──────────────────────────────────────────────────────────────────┘

Pretty, bonus points for effort there.

> However, Linux does not build sched domains for the pairs of cores
> which share an L2 cache. On s390, IBM added sched domains for books,
> that is, sets of cores which share an L2 cache. This support should
> probably be added in a generic way for all archs, based on the generic
> cache information.

Yeah, sched domain generation is currently somewhat crappy.

I think you'll find you get that L2 domain when you enable mc/smt
power savings on !magny-cours, due to this particular horror in
arch/x86/kernel/smpboot.c (possibly losing another level due to other
crap, and changing scheduler behaviour in ways you might not fancy):

const struct cpumask *cpu_coregroup_mask(int cpu)
{
	struct cpuinfo_x86 *c = &cpu_data(cpu);
	/*
	 * For perf, we return the last level cache shared map.
	 * And for power savings, we return cpu_core_map -- except on
	 * multi-node (AMD_DCM, i.e. magny-cours) parts, where
	 * cpu_core_map spans the whole package and thus cores that
	 * share no cache at all.
	 */
	if ((sched_mc_power_savings || sched_smt_power_savings) &&
	    !(cpu_has(c, X86_FEATURE_AMD_DCM)))
		return cpu_core_mask(cpu);
	else
		return cpu_llc_shared_mask(cpu);
}
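
For reference, the cache-sharing information you mention is already
exported generically through sysfs cacheinfo; a minimal userspace
sketch (illustrative only, next to no error handling) that pulls out
the L2 sibling set for a cpu:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one small sysfs attribute into buf; 0 on success. */
static int read_attr(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

int main(int argc, char **argv)
{
	int cpu = argc > 1 ? atoi(argv[1]) : 0;
	char path[256], level[16], shared[256];
	int idx;

	/* Walk the cache leaves until one is missing. */
	for (idx = 0; ; idx++) {
		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%d/cache/index%d/level",
			 cpu, idx);
		if (read_attr(path, level, sizeof(level)))
			break;
		if (strcmp(level, "2"))
			continue;
		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%d/cache/index%d/shared_cpu_list",
			 cpu, idx);
		if (!read_attr(path, shared, sizeof(shared)))
			printf("cpu%d L2 siblings: %s\n", cpu, shared);
	}
	return 0;
}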

I recently started reworking all that sched_domain crud and we're almost
at the point where we can remove all the legacy 'level' crap. That is,
nothing in the scheduler should depend on sd->level anymore (and, last
time I checked, nothing does).

So the current goal is to change sched_domain_topology to not be such a
silly hard coded list of domains, but build that thing dynamically based
on the system topology and set all the SD_flags correctly.
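
To give an idea of the shape that could take: a table of topology
levels, each carrying a mask function and its SD_flags, that the
domain builder walks per cpu. None of these names exist as kernel API
today -- cpu_l2_shared_mask in particular is hypothetical -- the point
is the structure, not the names:

/* Illustrative sketch only, not existing kernel code. */
typedef const struct cpumask *(*sd_mask_fn)(int cpu);

struct sd_topo_level {
	sd_mask_fn	mask;		/* span of the domain for a cpu */
	int		sd_flags;	/* SD_SHARE_PKG_RESOURCES etc. */
};

/*
 * An arch (or generic cacheinfo code) would fill this in from the
 * actual system topology instead of a hard coded list, e.g.:
 */
static struct sd_topo_level x86_topology[] = {
	{ cpu_smt_mask,       SD_SHARE_CPUPOWER | SD_SHARE_PKG_RESOURCES },
	{ cpu_l2_shared_mask, SD_SHARE_PKG_RESOURCES },	/* hypothetical */
	{ cpu_coregroup_mask, SD_SHARE_PKG_RESOURCES },
	{ cpu_cpu_mask,       0 },
	{ NULL, },
};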

If that is something you're willing to work on, that'd be totally
awesome.
