Message-ID: <525442A4.9060709@gmail.com>
Date:	Wed, 09 Oct 2013 01:36:36 +0800
From:	Zhang Yanfei <zhangyanfei.yes@...il.com>
To:	"H. Peter Anvin" <hpa@...or.com>, Tejun Heo <tj@...nel.org>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J . Wysocki" <rjw@...k.pl>, lenb@...nel.org,
	Thomas Gleixner <tglx@...utronix.de>, mingo@...e.hu,
	Toshi Kani <toshi.kani@...com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Thomas Renninger <trenn@...e.de>,
	Yinghai Lu <yinghai@...nel.org>,
	Jiang Liu <jiang.liu@...wei.com>,
	Wen Congyang <wency@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	isimatu.yasuaki@...fujitsu.com, izumi.taku@...fujitsu.com,
	Mel Gorman <mgorman@...e.de>, Minchan Kim <minchan@...nel.org>,
	mina86@...a86.com, gong.chen@...ux.intel.com,
	vasilis.liaskovitis@...fitbricks.com, lwoodman@...hat.com,
	Rik van Riel <riel@...hat.com>, jweiner@...hat.com,
	prarit@...hat.com, "x86@...nel.org" <x86@...nel.org>,
	linux-doc@...r.kernel.org,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Linux MM <linux-mm@...ck.org>, linux-acpi@...r.kernel.org,
	imtangchen@...il.com, Zhang Yanfei <zhangyanfei@...fujitsu.com>,
	Tang Chen <tangchen@...fujitsu.com>
Subject: Re: [PATCH part1 v6 4/6] x86/mem-hotplug: Support initialize page
 tables in bottom-up

Hello Tejun,
CC: Peter

On 10/07/2013 08:00 AM, H. Peter Anvin wrote:
> On 10/03/2013 07:00 PM, Zhang Yanfei wrote:
>> From: Tang Chen <tangchen@...fujitsu.com>
>>
>> The Linux kernel cannot migrate pages used by the kernel. As a
>> result, kernel pages cannot be hot-removed. So we cannot allocate
>> hotpluggable memory for the kernel.
>>
>> In a memory hotplug system, any numa node the kernel resides in
>> should be unhotpluggable. And for a modern server, each node could
>> have at least 16GB memory. So memory around the kernel image is
>> highly likely unhotpluggable.
>>
>> ACPI SRAT (System Resource Affinity Table) contains the memory
>> hotplug info. But before SRAT is parsed, memblock has already
>> started to allocate memory for the kernel. So we need to prevent
>> memblock from doing this.
>>
>> So direct memory mapping page tables setup is the case. init_mem_mapping()
>> is called before SRAT is parsed. To prevent page tables being allocated
>> within hotpluggable memory, we will use bottom-up direction to allocate
>> page tables from the end of kernel image to the higher memory.
>>
>> Acked-by: Tejun Heo <tj@...nel.org>
>> Signed-off-by: Tang Chen <tangchen@...fujitsu.com>
>> Signed-off-by: Zhang Yanfei <zhangyanfei@...fujitsu.com>
> 
> I'm still seriously concerned about this.  This unconditionally
> introduces new behavior which may very well break some classes of
> systems -- the whole point of creating the page tables top down is
> because the kernel tends to be allocated in lower memory, which is also
> the memory that some devices need for DMA.
> 

After thinking about this for a while, the issue Peter pointed out does seem
to be real. Looking back at your suggestion to allocate close to the kernel
image:

> so if we allocate memory close to the kernel image,
>   it's likely that we don't contaminate hotpluggable node.  We're
>   talking about few megs at most right after the kernel image.  I
>   can't see how that would make any noticeable difference.

You meant that the amount of memory involved is only a few megabytes. But on
machines with a lot of memory, the page tables can be large enough to consume
the precious lower memory. So I think we really should reorder the page table
setup so that it runs after we have obtained the hotplug info in some way,
just as patch 5 reorders reserve_crashkernel() to be called after
initmem_init().
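
As a rough illustration of how large the direct-mapping page tables can get
(not taken from the patch), here is a back-of-the-envelope sketch. It assumes
x86-64 4-level paging, 512 eight-byte entries per 4 KiB table, one contiguous
memory range without holes, and a single page size for the whole mapping.
The function name page_table_bytes() is made up for this example; the real
init_mem_mapping() mixes page sizes, so treat the numbers as an estimate only.

```python
ENTRIES_PER_TABLE = 512   # one 4 KiB table holds 512 eight-byte entries
TABLE_BYTES = 4096

# Number of table levels (including the top-level PML4) needed to map
# memory with a given page size. Assumption: x86-64 4-level paging.
LEVELS = {4 << 10: 4, 2 << 20: 3, 1 << 30: 2}


def page_table_bytes(mem_bytes, page_size):
    """Estimate the bytes of page tables needed to direct-map mem_bytes."""
    entries = -(-mem_bytes // page_size)          # ceiling division
    tables = 0
    # Walk up from the lowest level, stopping just below the PML4.
    for _ in range(LEVELS[page_size] - 1):
        level_tables = -(-entries // ENTRIES_PER_TABLE)
        tables += level_tables
        entries = level_tables                    # one entry per lower table
    tables += 1                                   # the single PML4
    return tables * TABLE_BYTES


if __name__ == "__main__":
    one_tib = 1 << 40
    for page_size in (4 << 10, 2 << 20, 1 << 30):
        size = page_table_bytes(one_tib, page_size)
        print(f"page size {page_size:>10}: {size / (1 << 20):.2f} MiB of tables")
```

Under these assumptions, mapping 1 TiB takes roughly 2 GiB of page tables
with 4 KiB pages, about 4 MiB with 2 MiB pages, and 12 KiB with 1 GiB pages,
so the cost depends heavily on which page sizes the mapping can use.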

So do you still have any objection to reordering the page table setup?

-- 
Thanks.
Zhang Yanfei