Message-ID: <20140728163909.GR19379@twins.programming.kicks-ass.net>
Date:	Mon, 28 Jul 2014 18:39:09 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Josef Bacik <jbacik@...com>
Cc:	x86@...nel.org, tglx@...utronix.de, mingo@...hat.com,
	hpa@...or.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC] [PATCH] x86: don't check numa topology when setting up
 core siblings

On Mon, Jul 28, 2014 at 12:28:39PM -0400, Josef Bacik wrote:
> We have these processors with this Cluster-on-Die feature, which shares NUMA
> nodes between cores on different sockets.

Uhm, what?! I know AMD has chips that have two nodes per package, but
what you say doesn't make sense.

> When booting up we were getting this
> error with COD enabled (this is a 4 socket 12 core per CPU box)
> 
>  smpboot: Booting Node   0, Processors  #1 #2 #3 #4 #5 OK
>  ------------[ cut here ]------------
>  WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x82()
>  sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
>  smpboot: Booting Node   1, Processors  #6
>  Modules linked in:
>  CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.10.39-31_fbk12_01013_ga2de9bf #1
>  Hardware name: Quanta Leopard-DDR3/Leopard-DDR3, BIOS F06_3A03.08 05/24/2014
>   ffffffff810971d4 ffff8802748d3e48 0000000000000009 ffff8802748d3df8
>   ffffffff815bba59 ffff8802748d3e38 ffffffff8103b02b ffff8802748d3e28
>   0000000000000001 000000000000b010 0000000000012580 0000000000000000
>  Call Trace:
>   [<ffffffff810971d4>] ? print_modules+0x54/0xa0
>   [<ffffffff815bba59>] dump_stack+0x19/0x1b
>   [<ffffffff8103b02b>] warn_slowpath_common+0x6b/0xa0
>   [<ffffffff8103b101>] warn_slowpath_fmt+0x41/0x50
>   [<ffffffff815ada56>] topology_sane.isra.2+0x6f/0x82
>   [<ffffffff815ade23>] set_cpu_sibling_map+0x380/0x42c
>   [<ffffffff815adfe7>] start_secondary+0x118/0x19a
>  ---[ end trace 755dbfb52f761180 ]---
>   #7 #8 #9 #10 #11 OK
> 
> and then /proc/cpuinfo would show "cores: 6" instead of "cores: 12" because
> the sibling map doesn't get set up right.

Yeah, looks like your topology setup is wrecked alright.
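For anyone reproducing this, the truncated sibling map is visible directly in sysfs. These are the standard kernel topology files, not anything specific to the box in the report; a sketch of how to check:

```shell
# Which CPUs the kernel put in CPU 0's package-sibling (mc) mask; after
# the warning above, CPUs the check discarded will be missing from it.
cat /sys/devices/system/cpu/cpu0/topology/core_siblings_list
# The per-package core count exported to /proc/cpuinfo -- this is the
# "cores: 6" vs "cores: 12" value from the report.
grep -m1 'cpu cores' /proc/cpuinfo
# The NUMA node layout the sanity check is comparing against.
ls -d /sys/devices/system/node/node* 2>/dev/null
```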

> This patch fixes this. 

No, as you say, this patch just makes the warning go away, you still
have a royally fucked topology setup.

> Now I realize
> this is probably not the correct fix but I'm an FS guy and I don't understand
> this stuff.

:-)

> Looking at the cpuflags with COD on and off there appears to be no
> difference.  The only difference I can spot is with it on we have 4 numa nodes
> and with it off we have 2, but that seems like a flaky check at best to add.
> I'm open to suggestions on how to fix this properly.  Thanks,

Got a link that explains this COD nonsense?

Google gets me something about Intel SSSC, but nothing that explains
your BIOS? knob.

I suspect your BIOS is buggy and doesn't properly modify the CPUID
topology data.


Content of type "application/pgp-signature" skipped
