Date:	Wed, 27 Feb 2013 10:14:18 +0800
From:	Tang Chen <tangchen@...fujitsu.com>
To:	Yinghai Lu <yinghai@...nel.org>
CC:	Don Morris <don.morris@...com>, "H. Peter Anvin" <hpa@...or.com>,
	Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tony Luck <tony.luck@...el.com>,
	Thomas Renninger <trenn@...e.de>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Tim Gardner <tim.gardner@...onical.com>,
	linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
	x86@...nel.org, a.p.zijlstra@...llo.nl, jarkko.sakkinen@...el.com
Subject: Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

Hi Yinghai,

Please see below. :)

On 02/27/2013 06:44 AM, Yinghai Lu wrote:
>>> that commit is totally broken, and it should be reverted.
>>>
>>> 1. numa_init is called several times, NOT just for srat. so those
>>>     nodes_clear(numa_nodes_parsed)
>>>     memset(&numa_meminfo, 0, sizeof(numa_meminfo))
>>> can not be just removed.
>>> please consider sequence is: numaq, srat, amd, dummy.
>>> You need to make fall back path working!
>>>
>>> 2. simply split acpi_numa_init to early_parse_srat.
>>> a. that early_parse_srat is NOT called for ia64, so you break ia64.
>>> b.  for (i = 0; i<  MAX_LOCAL_APIC; i++)
>>>       set_apicid_to_node(i, NUMA_NO_NODE)
>>> still left in numa_init. So it will just clear result from early_parse_srat.
>>> it should be moved before that....
>>
>>     c.  it breaks ACPI_TABLE_OVERIDE...as the acpi table scan is moved
>> early before override from INITRD is settled.
>>
>>>
>>> 3. that patch TITLE is total misleading, there is NO x86 in the title,
>>> but it changes
>>> to x86 code.
>>>
>>> 4, it does not CC to TJ and other numa guys...
>
> After looked at the code more, thought that theory that does not let
> kernel use ram
> on hotplug area is not right.
>
> after that commit, following range can not use movable ram:
> 1. real_mode code.... well..funny, legacy cpu0 [0,1M) could be hot-removed?
> 2. dma_continguous ?
> 3. log buff ring.
> 4. initrd... why it will be freed after booting, so it could be on movable...
> 5. crashkernel for kdump...: : looks like we can not put kdump kernel
> above 4G anymore
> 6. initmem_init: it will allocate page table to setup kernel mapping
> for memory..., it should
> be with BRK and near end of max_pfn....

AFAIK, the Linux kernel currently cannot migrate memory used by the
kernel itself. So any memory used by the kernel should not be in a
movable area.

>
> If node is hotplugable, the mem related stuff like page table and
> vmemmap could be
> on the that node without problem and should be on that node.

Page tables and vmemmap are kernel memory. They should not be movable,
I think.

>
> assume first cpu only have 1G ram, and other 31 socket will have bunch of ram
> and those cpu with ram could be hotadd and hotremoved.
> Now you want to put page table and vmemmap on first node.
> The system would not boot as not enough memory for cover whole system RAM.

Yes, you are right. And a more extreme situation has already been
raised by HPA:

"If all the memory is hot-pluggable, then the kernel won't be able to boot."

So, please refer to commit 01a178a94e8eaec351b29ee49fbb3d1c124cb7fb:
	acpi, memory-hotplug: support getting hotplug info from SRAT

I have excluded all the memory reserved by memblock, and any node that
has memory reserved by memblock will be set to un-hot-pluggable, which
means we will have enough memory (all the memory on the node) to boot
the kernel. So I think the problem you are talking about has been
solved.
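
To make the rule concrete, here is a minimal user-space sketch of the
idea. It is not the code from the commit above: the struct names,
MAX_NODES, and pin_nodes_with_reservations() are hypothetical and exist
only for this illustration. It assumes we have the per-node memory
ranges, the list of early reservations, and a per-node hotpluggable flag
as reported by firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8	/* hypothetical limit, for this sketch only */

/* A physical address range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* A memory range together with the NUMA node that owns it. */
struct node_range {
	struct range r;
	int nid;
};

/* True if the two half-open ranges share at least one byte. */
bool ranges_overlap(const struct range *a, const struct range *b)
{
	return a->start < b->end && b->start < a->end;
}

/*
 * On entry, hotpluggable[nid] holds what firmware reported for each node.
 * Any node owning memory that overlaps an early reservation is marked
 * un-hot-pluggable, so all of its memory stays usable by the kernel.
 */
void pin_nodes_with_reservations(const struct node_range *mem, size_t n_mem,
				 const struct range *reserved,
				 size_t n_reserved,
				 bool hotpluggable[MAX_NODES])
{
	for (size_t i = 0; i < n_mem; i++) {
		assert(mem[i].nid >= 0 && mem[i].nid < MAX_NODES);
		for (size_t j = 0; j < n_reserved; j++) {
			if (ranges_overlap(&mem[i].r, &reserved[j])) {
				hotpluggable[mem[i].nid] = false;
				break;
			}
		}
	}
}
```

With this rule, a node that already holds something like the kernel
image or the initrd can never be removed, while nodes with no early
reservations keep whatever hotpluggable state firmware reported.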

>
> e8d1955258091e4c92d5a975ebd7fd8a98f5d30f and related commits should be just
> reverted now.
>
> Thanks
>
> Yinghai
>