Message-ID: <CAE9FiQXs3MyUW+hJhupJgL-t9sYmT5GdeStWD2vMjsJG+6qCrQ@mail.gmail.com>
Date:	Mon, 25 Feb 2013 16:35:36 -0800
From:	Yinghai Lu <yinghai@...nel.org>
To:	Don Morris <don.morris@...com>, "H. Peter Anvin" <hpa@...or.com>,
	Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tony Luck <tony.luck@...el.com>, Ingo Molnar <mingo@...e.hu>,
	Martin.Bligh@...ibm.com, Martin Bligh <mbligh@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Tim Gardner <tim.gardner@...onical.com>,
	linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
	x86@...nel.org, a.p.zijlstra@...llo.nl, jarkko.sakkinen@...el.com,
	tangchen@...fujitsu.com
Subject: Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

On Mon, Feb 25, 2013 at 2:50 PM, Yinghai Lu <yinghai@...nel.org> wrote:
> On Mon, Feb 25, 2013 at 1:27 PM, Don Morris <don.morris@...com> wrote:
>> On 02/25/2013 10:32 AM, Tim Gardner wrote:
>>> On 02/25/2013 08:02 AM, Tim Gardner wrote:
>>>> Is this an expected warning ? I'll boot a vanilla kernel just to be sure.
>>>>
>>>> rebased against ab7826595e9ec51a51f622c5fc91e2f59440481a in Linus' repo:
>>>>
>>>
>>> Same with a vanilla kernel, so it doesn't appear that any Ubuntu cruft
>>> is having an impact:
>>
>> Reproduced on a HP z620 workstation (E5-2620 instead of E5-2680, but
>> still Sandy Bridge, though I don't think that matters).
>>
>> Bisection leads to:
>> # bad: [e8d1955258091e4c92d5a975ebd7fd8a98f5d30f] acpi, memory-hotplug:
>> parse SRAT before memblock is ready
>>
>> Nothing terribly obvious leaps out as to *why* that reshuffling messes
>> up the cpu<-->node bindings, but I wanted to put this out there while
>> I poke around further. [Note that the SRAT: PXM -> APIC -> Node print
>> outs during boot are the same either way -- if you look at the APIC
>> numbers of the processors (from /proc/cpuinfo), the processors should
>> be assigned to the correct node, but they aren't.] cc'ing Tang Chen
>> in case this is obvious to him or he's already fixed it somewhere not
>> on Linus's tree yet.
>>
>> Don Morris
>>
>>>
>>> [    0.170435] ------------[ cut here ]------------
>>> [    0.170450] WARNING: at arch/x86/kernel/smpboot.c:324
>>> topology_sane.isra.2+0x71/0x84()
>>> [    0.170452] Hardware name: S2600CP
>>> [    0.170454] sched: CPU #1's llc-sibling CPU #0 is not on the same
>>> node! [node: 1 != 0]. Ignoring dependency.
>>> [    0.156000] smpboot: Booting Node   1, Processors  #1
>>> [    0.170455] Modules linked in:
>>> [    0.170460] Pid: 0, comm: swapper/1 Not tainted 3.8.0+ #1
>>> [    0.170461] Call Trace:
>>> [    0.170466]  [<ffffffff810597bf>] warn_slowpath_common+0x7f/0xc0
>>> [    0.170473]  [<ffffffff810598b6>] warn_slowpath_fmt+0x46/0x50
>>> [    0.170477]  [<ffffffff816cc752>] topology_sane.isra.2+0x71/0x84
>>> [    0.170482]  [<ffffffff816cc9de>] set_cpu_sibling_map+0x23f/0x436
>>> [    0.170487]  [<ffffffff816ccd0c>] start_secondary+0x137/0x201
>>> [    0.170502] ---[ end trace 09222f596307ca1d ]---
>
> That commit is totally broken and should be reverted.
>
> 1. numa_init is called several times, NOT just for SRAT, so
>    nodes_clear(numa_nodes_parsed)
>    memset(&numa_meminfo, 0, sizeof(numa_meminfo))
>    cannot simply be removed.
>    Please consider that the sequence is: numaq, srat, amd, dummy.
>    The fallback path needs to keep working!
>
> 2. The patch simply splits acpi_numa_init into early_parse_srat.
>    a. early_parse_srat is NOT called for ia64, so ia64 breaks.
>    b. for (i = 0; i < MAX_LOCAL_APIC; i++)
>           set_apicid_to_node(i, NUMA_NO_NODE)
>       is still left in numa_init, so it just clears the result from
>       early_parse_srat. It should be moved before that....
>
> 3. The patch TITLE is totally misleading: there is NO x86 in the title,
>    but it changes x86 code.
>
> 4. It was not CC'd to TJ and the other NUMA guys...

The attached patch works around the problem for now,
but it assumes that NUMAQ does not have an SRAT table.

Martin, can you confirm that NUMAQ does not have an SRAT?

Yinghai

Download attachment "x.patch" of type "application/octet-stream" (2428 bytes)
