Date:	Mon, 25 Feb 2013 18:06:38 -0800
From:	Yinghai Lu <yinghai@...nel.org>
To:	Don Morris <don.morris@...com>, "H. Peter Anvin" <hpa@...or.com>,
	Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tony Luck <tony.luck@...el.com>, Ingo Molnar <mingo@...e.hu>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	mbligh@...igh.org
Cc:	Tim Gardner <tim.gardner@...onical.com>,
	linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
	x86@...nel.org, a.p.zijlstra@...llo.nl, jarkko.sakkinen@...el.com,
	tangchen@...fujitsu.com
Subject: Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

[ Adding Martin with his new address ]

On Mon, Feb 25, 2013 at 4:35 PM, Yinghai Lu <yinghai@...nel.org> wrote:
> On Mon, Feb 25, 2013 at 2:50 PM, Yinghai Lu <yinghai@...nel.org> wrote:
>> On Mon, Feb 25, 2013 at 1:27 PM, Don Morris <don.morris@...com> wrote:
>>> On 02/25/2013 10:32 AM, Tim Gardner wrote:
>>>> On 02/25/2013 08:02 AM, Tim Gardner wrote:
>>>>> Is this an expected warning ? I'll boot a vanilla kernel just to be sure.
>>>>>
>>>>> rebased against ab7826595e9ec51a51f622c5fc91e2f59440481a in Linus' repo:
>>>>>
>>>>
>>>> Same with a vanilla kernel, so it doesn't appear that any Ubuntu cruft
>>>> is having an impact:
>>>
>>> Reproduced on a HP z620 workstation (E5-2620 instead of E5-2680, but
>>> still Sandy Bridge, though I don't think that matters).
>>>
>>> Bisection leads to:
>>> # bad: [e8d1955258091e4c92d5a975ebd7fd8a98f5d30f] acpi, memory-hotplug:
>>> parse SRAT before memblock is ready
>>>
>>> Nothing terribly obvious leaps out as to *why* that reshuffling messes
>>> up the cpu<-->node bindings, but I wanted to put this out there while
>>> I poke around further. [Note that the SRAT: PXM -> APIC -> Node print
>>> outs during boot are the same either way -- if you look at the APIC
>>> numbers of the processors (from /proc/cpuinfo), the processors should
>>> be assigned to the correct node, but they aren't.] cc'ing Tang Chen
>>> in case this is obvious to him or he's already fixed it somewhere not
>>> on Linus's tree yet.
>>>
>>> Don Morris
>>>
>>>>
>>>> [    0.170435] ------------[ cut here ]------------
>>>> [    0.170450] WARNING: at arch/x86/kernel/smpboot.c:324
>>>> topology_sane.isra.2+0x71/0x84()
>>>> [    0.170452] Hardware name: S2600CP
>>>> [    0.170454] sched: CPU #1's llc-sibling CPU #0 is not on the same
>>>> node! [node: 1 != 0]. Ignoring dependency.
>>>> [    0.156000] smpboot: Booting Node   1, Processors  #1
>>>> [    0.170455] Modules linked in:
>>>> [    0.170460] Pid: 0, comm: swapper/1 Not tainted 3.8.0+ #1
>>>> [    0.170461] Call Trace:
>>>> [    0.170466]  [<ffffffff810597bf>] warn_slowpath_common+0x7f/0xc0
>>>> [    0.170473]  [<ffffffff810598b6>] warn_slowpath_fmt+0x46/0x50
>>>> [    0.170477]  [<ffffffff816cc752>] topology_sane.isra.2+0x71/0x84
>>>> [    0.170482]  [<ffffffff816cc9de>] set_cpu_sibling_map+0x23f/0x436
>>>> [    0.170487]  [<ffffffff816ccd0c>] start_secondary+0x137/0x201
>>>> [    0.170502] ---[ end trace 09222f596307ca1d ]---
>>
>> That commit is totally broken and should be reverted.
>>
>> 1. numa_init is called several times, NOT just for SRAT, so the
>>    nodes_clear(numa_nodes_parsed)
>>    memset(&numa_meminfo, 0, sizeof(numa_meminfo))
>> calls cannot simply be removed.
>> Please consider that the sequence is: numaq, srat, amd, dummy.
>> You need to keep the fallback path working!
>>
>> 2. It simply splits early_parse_srat out of acpi_numa_init.
>> a. early_parse_srat is NOT called for ia64, so this breaks ia64.
>> b.  for (i = 0; i < MAX_LOCAL_APIC; i++)
>>      set_apicid_to_node(i, NUMA_NO_NODE)
>> is still left in numa_init, so it just clears the result from early_parse_srat.
>> It should be moved before that....
>>
>> 3. The patch TITLE is totally misleading: there is NO x86 in the title,
>> but it changes x86 code.
>>
>> 4. It was not CC'd to TJ and the other NUMA guys...
>
> The attached patch works around the problem for now,
> but it assumes that NUMAQ does not have an SRAT table.
>
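
To make points 1 and 2b above concrete, here is a small userspace sketch in
C. It is not kernel code: it only borrows the names numa_init,
set_apicid_to_node, MAX_LOCAL_APIC and NUMA_NO_NODE from the discussion, and
every other name (failing_probe, dummy_probe, early_parse_srat_sim,
srat_already_parsed, numa_init_sim, the two *_demo functions) is hypothetical.
It assumes that numa_init resets shared state and then runs one probe method,
and that a failed probe can leave partial results behind.

```c
#include <assert.h>

#define MAX_LOCAL_APIC 8      /* tiny illustrative size, not the kernel value */
#define NUMA_NO_NODE   (-1)

static int apicid_to_node[MAX_LOCAL_APIC];
static unsigned long nodes_parsed;   /* stand-in for numa_nodes_parsed */

typedef int (*init_fn)(void);

static void set_apicid_to_node(int apicid, int node)
{
    assert(apicid >= 0 && apicid < MAX_LOCAL_APIC);
    apicid_to_node[apicid] = node;
}

/* The per-attempt cleanup that point 1 says cannot simply be removed. */
static void reset_state(void)
{
    for (int i = 0; i < MAX_LOCAL_APIC; i++)
        set_apicid_to_node(i, NUMA_NO_NODE);
    nodes_parsed = 0;
}

/* Hypothetical probe that records partial results and then fails. */
static int failing_probe(void)
{
    set_apicid_to_node(3, 7);
    nodes_parsed |= 1UL << 7;
    return -1;
}

/* Hypothetical last-resort probe: a single node 0, no APIC mappings. */
static int dummy_probe(void)
{
    nodes_parsed |= 1UL << 0;
    return 0;
}

/* Hypothetical early SRAT parse that runs before numa_init_sim(). */
static void early_parse_srat_sim(void)
{
    set_apicid_to_node(0, 0);
    set_apicid_to_node(1, 1);
}

/* Hypothetical SRAT probe that relies on the early parse having run. */
static int srat_already_parsed(void)
{
    return 0;
}

/* Simplified numa_init: optionally reset, then try one method. */
static int numa_init_sim(init_fn fn, int reset)
{
    if (reset)
        reset_state();
    return fn();
}

/* Point 1: without a reset per attempt, stale state survives the fallback. */
int fallback_demo(int reset)
{
    reset_state();                       /* clean boot state */
    if (numa_init_sim(failing_probe, reset) < 0)
        numa_init_sim(dummy_probe, reset);
    return apicid_to_node[3];
}

/* Point 2b: a reset left inside numa_init wipes the early parse result. */
int early_parse_demo(void)
{
    reset_state();
    early_parse_srat_sim();
    numa_init_sim(srat_already_parsed, 1);
    return apicid_to_node[1];
}
```

In this model fallback_demo(0) returns 7 (the stale mapping from the failed
probe), fallback_demo(1) returns NUMA_NO_NODE, and early_parse_demo() returns
NUMA_NO_NODE because the mapping recorded by the early parse has been cleared.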

Martin, can you confirm that NUMAQ does not have an SRAT table?

Thanks

Yinghai

Download attachment "x.patch" of type "application/octet-stream" (2428 bytes)
