Message-ID: <20140728163909.GR19379@twins.programming.kicks-ass.net>
Date: Mon, 28 Jul 2014 18:39:09 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Josef Bacik <jbacik@...com>
Cc: x86@...nel.org, tglx@...utronix.de, mingo@...hat.com,
hpa@...or.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC] [PATCH] x86: don't check numa topology when setting up core siblings
On Mon, Jul 28, 2014 at 12:28:39PM -0400, Josef Bacik wrote:
> We have these processors with this Cluster on die feature which shares numa
> nodes between cores on different sockets.
Uhm, what?! I know AMD has chips that have two nodes per package, but
what you say doesn't make sense.
> When booting up we were getting this
> error with COD enabled (this is a 4 socket 12 core per CPU box)
>
> smpboot: Booting Node 0, Processors #1 #2 #3 #4 #5 OK
> ------------[ cut here ]------------
> WARNING: at arch/x86/kernel/smpboot.c:324 topology_sane.isra.2+0x6f/0x82()
> sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
> smpboot: Booting Node 1, Processors #6
> Modules linked in:
> CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.10.39-31_fbk12_01013_ga2de9bf #1
> Hardware name: Quanta Leopard-DDR3/Leopard-DDR3, BIOS F06_3A03.08 05/24/2014
> ffffffff810971d4 ffff8802748d3e48 0000000000000009 ffff8802748d3df8
> ffffffff815bba59 ffff8802748d3e38 ffffffff8103b02b ffff8802748d3e28
> 0000000000000001 000000000000b010 0000000000012580 0000000000000000
> Call Trace:
> [<ffffffff810971d4>] ? print_modules+0x54/0xa0
> [<ffffffff815bba59>] dump_stack+0x19/0x1b
> [<ffffffff8103b02b>] warn_slowpath_common+0x6b/0xa0
> [<ffffffff8103b101>] warn_slowpath_fmt+0x41/0x50
> [<ffffffff815ada56>] topology_sane.isra.2+0x6f/0x82
> [<ffffffff815ade23>] set_cpu_sibling_map+0x380/0x42c
> [<ffffffff815adfe7>] start_secondary+0x118/0x19a
> ---[ end trace 755dbfb52f761180 ]---
> #7 #8 #9 #10 #11 OK
>
> and then the /proc/cpuinfo would show "cores: 6" instead of "cores: 12" because
> the sibling map doesn't get set right.
Yeah, looks like your topology setup is wrecked alright.
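
To illustrate why the warning and the "cores: 6" symptom go together, here is a minimal sketch of the kind of same-node check that produces the message quoted above. It is written in Python rather than the kernel's C and is not the kernel code: the names `topology_sane` and `core_siblings` as used here, the dict-based `package_of` and `cpu_to_node` inputs, and the warn-once handling are all assumptions made for illustration.

```python
# Hypothetical model of the sibling sanity check; not the actual kernel code.
_warned = set()  # assumed warn-once behaviour, tracked per sibling kind


def topology_sane(cpu, other, cpu_to_node, kind="mc"):
    """Return True if both CPUs are on the same NUMA node, else warn once."""
    node_a, node_b = cpu_to_node[cpu], cpu_to_node[other]
    if node_a == node_b:
        return True
    if kind not in _warned:
        _warned.add(kind)
        print(f"sched: CPU #{cpu}'s {kind}-sibling CPU #{other} is not on "
              f"the same node! [node: {node_a} != {node_b}]. "
              "Ignoring dependency.")
    return False


def core_siblings(cpu, package_of, cpu_to_node):
    """CPUs in the same package as `cpu` that also pass the node check."""
    return sorted(
        other for other in package_of
        if package_of[other] == package_of[cpu]
        and topology_sane(cpu, other, cpu_to_node)
    )


# One 12-core package that the firmware splits across two NUMA nodes.
package_of = {cpu: 0 for cpu in range(12)}
cpu_to_node = {cpu: cpu // 6 for cpu in range(12)}
siblings = core_siblings(6, package_of, cpu_to_node)
print(len(siblings))  # 6
```

With the package split over two nodes, CPU 6 only keeps the six siblings on its own node, which would match the halved core count in the report.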
> This patch fixes this.
No, as you say, this patch just makes the warning go away, you still
have a royally fucked topology setup.
> Now I realize
> this is probably not the correct fix but I'm an FS guy and I don't understand
> this stuff.
:-)
> Looking at the cpuflags with COD on and off there appears to be no
> difference. The only difference I can spot is with it on we have 4 numa nodes
> and with it off we have 2, but that seems like a flakey check at best to add.
> I'm open to suggestions on how to fix this properly. Thanks,
Got a link that explains this COD nonsense?
Google gets me something about Intel SSSC, but nothing that explains
your BIOS? knob.
I suspect your BIOS is buggy and doesn't properly modify the CPUID
topology data.
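
One way to see what the kernel ended up with is to compare the package IDs against the NUMA node assignment. The following is a rough sketch, assuming the usual sysfs layout (`cpu/cpuN/topology/physical_package_id` and `node/nodeN/cpulist` under `/sys/devices/system`); the helper names are made up for this example, and a package that shows up in the result is only a hint about where to look, not proof of a BIOS bug.

```python
import glob
import os
import re
from collections import defaultdict


def parse_cpulist(text):
    """Parse a sysfs cpulist string such as '0-5,12-17' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def packages_spanning_nodes(package_of, node_of):
    """Map each package that covers more than one NUMA node to its nodes."""
    nodes = defaultdict(set)
    for cpu, pkg in package_of.items():
        if cpu in node_of:  # skip CPUs with no node assignment
            nodes[pkg].add(node_of[cpu])
    return {pkg: sorted(n) for pkg, n in nodes.items() if len(n) > 1}


def read_sysfs(root="/sys/devices/system"):
    """Collect cpu->package and cpu->node maps (assumed sysfs layout)."""
    package_of, node_of = {}, {}
    pattern = os.path.join(root, "cpu/cpu[0-9]*/topology/physical_package_id")
    for path in glob.glob(pattern):
        cpu = int(re.search(r"cpu(\d+)/topology", path).group(1))
        with open(path) as f:
            package_of[cpu] = int(f.read())
    for path in glob.glob(os.path.join(root, "node/node[0-9]*/cpulist")):
        node = int(re.search(r"node(\d+)/cpulist", path).group(1))
        with open(path) as f:
            for cpu in parse_cpulist(f.read()):
                node_of[cpu] = node
    return package_of, node_of
```

On a live machine, `packages_spanning_nodes(*read_sysfs())` would list any package whose CPUs are spread over several nodes, which is the situation the warning complains about.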