lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5258E627.7020303@cn.fujitsu.com>
Date:	Sat, 12 Oct 2013 14:03:19 +0800
From:	Zhang Yanfei <zhangyanfei@...fujitsu.com>
To:	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J . Wysocki" <rjw@...k.pl>, Len Brown <lenb@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	Tejun Heo <tj@...nel.org>, Toshi Kani <toshi.kani@...com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Thomas Renninger <trenn@...e.de>,
	Yinghai Lu <yinghai@...nel.org>,
	Jiang Liu <jiang.liu@...wei.com>,
	Wen Congyang <wency@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
	Taku Izumi <izumi.taku@...fujitsu.com>,
	Mel Gorman <mgorman@...e.de>, Minchan Kim <minchan@...nel.org>,
	"mina86@...a86.com" <mina86@...a86.com>,
	"gong.chen@...ux.intel.com" <gong.chen@...ux.intel.com>,
	Vasilis Liaskovitis <vasilis.liaskovitis@...fitbricks.com>,
	"lwoodman@...hat.com" <lwoodman@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	"jweiner@...hat.com" <jweiner@...hat.com>,
	Prarit Bhargava <prarit@...hat.com>
CC:	"x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Linux MM <linux-mm@...ck.org>,
	ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
	Chen Tang <imtangchen@...il.com>,
	Tang Chen <tangchen@...fujitsu.com>,
	Zhang Yanfei <zhangyanfei.yes@...il.com>
Subject: [PATCH part2 v2 1/8] x86: get pg_data_t's memory from other node

From: Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>

If system can create movable node which all memory of the node is allocated
as ZONE_MOVABLE, setup_node_data() cannot allocate memory for the node's
pg_data_t. So, invoke memblock_alloc_nid(...MAX_NUMNODES) again to retry when
the first allocation fails. Otherwise, the system could failed to boot.
(We don't use memblock_alloc_try_nid() to retry because in this function,
if the allocation fails, it will panic the system.)

The node_data could be on hotpluggable node. And so could pagetable and
vmemmap. But for now, doing so will break memory hot-remove path.

A node could have several memory devices. And the device who holds node
data should be hot-removed in the last place. But in NUMA level, we don't
know which memory_block (/sys/devices/system/node/nodeX/memoryXXX) belongs
to which memory device. We only have node. So we can only do node hotplug.

But in virtualization, developers are now developing memory hotplug in qemu,
which support a single memory device hotplug. So a whole node hotplug will
not satisfy virtualization users.

So at last, we concluded that we'd better do memory hotplug and local node
things (local node node data, pagetable, vmemmap, ...) in two steps.
Please refer to https://lkml.org/lkml/2013/6/19/73

For now, we put node_data of movable node to another node, and then improve
it in the future.

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
Signed-off-by: Tang Chen <tangchen@...fujitsu.com>
Signed-off-by: Jiang Liu <jiang.liu@...wei.com>
Signed-off-by: Tang Chen <tangchen@...fujitsu.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@...fujitsu.com>
Reviewed-by: Wanpeng Li <liwanp@...ux.vnet.ibm.com>
Acked-by: Toshi Kani <toshi.kani@...com>
---
 arch/x86/mm/numa.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 24aec58..e17db5d 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -211,9 +211,14 @@ static void __init setup_node_data(int nid, u64 start, u64 end)
 	 */
 	nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES, nid);
 	if (!nd_pa) {
-		pr_err("Cannot find %zu bytes in node %d\n",
-		       nd_size, nid);
-		return;
+		pr_warn("Cannot find %zu bytes in node %d, so try other nodes",
+			nd_size, nid);
+		nd_pa = memblock_alloc_nid(nd_size, SMP_CACHE_BYTES,
+					   MAX_NUMNODES);
+		if (!nd_pa) {
+			pr_err("Cannot find %zu bytes in any node\n", nd_size);
+			return;
+		}
 	}
 	nd = __va(nd_pa);
 
-- 
1.7.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ