Date:	Sat, 12 Oct 2013 14:00:00 +0800
From:	Zhang Yanfei <zhangyanfei@...fujitsu.com>
To:	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J . Wysocki" <rjw@...k.pl>, Len Brown <lenb@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	Tejun Heo <tj@...nel.org>, Toshi Kani <toshi.kani@...com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Thomas Renninger <trenn@...e.de>,
	Yinghai Lu <yinghai@...nel.org>,
	Jiang Liu <jiang.liu@...wei.com>,
	Wen Congyang <wency@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
	Taku Izumi <izumi.taku@...fujitsu.com>,
	Mel Gorman <mgorman@...e.de>, Minchan Kim <minchan@...nel.org>,
	"mina86@...a86.com" <mina86@...a86.com>,
	"gong.chen@...ux.intel.com" <gong.chen@...ux.intel.com>,
	Vasilis Liaskovitis <vasilis.liaskovitis@...fitbricks.com>,
	"lwoodman@...hat.com" <lwoodman@...hat.com>,
	Rik van Riel <riel@...hat.com>,
	"jweiner@...hat.com" <jweiner@...hat.com>,
	Prarit Bhargava <prarit@...hat.com>
CC:	"x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Linux MM <linux-mm@...ck.org>,
	ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
	Chen Tang <imtangchen@...il.com>,
	Tang Chen <tangchen@...fujitsu.com>,
	Zhang Yanfei <zhangyanfei.yes@...il.com>
Subject: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

Hello guys, this is part2 of our memory hotplug work. This part
is based on part1:
    "x86, memblock: Allocate memory near kernel image before SRAT parsed"
which is based on 3.12-rc4.

You can find part1 at: https://lkml.org/lkml/2013/10/10/644

Any comments are welcome! Thanks!

[Problem]

Linux currently cannot migrate pages used by the kernel because
of the kernel direct mapping. In Linux kernel space, va = pa + PAGE_OFFSET.
When the pa is changed, we cannot simply update the pagetable and
keep the va unmodified. So the kernel pages are not migratable.
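
As a rough illustration of the fixed relation above, here is a minimal
user-space sketch. The PAGE_OFFSET value is an assumption (the x86-64
direct-map base used by 3.x kernels; the real constant depends on the
architecture and configuration), and the sketch_* helpers are
hypothetical names, not kernel functions.

```c
#include <stdint.h>

/* Assumed value for illustration only; arch- and config-specific. */
#define PAGE_OFFSET 0xffff880000000000ULL

/*
 * Direct mapping: the virtual address is a fixed offset from the
 * physical address, so there is no per-page translation that could
 * be repointed at a new physical page.
 */
static inline uint64_t sketch_pa_to_va(uint64_t pa)
{
	return pa + PAGE_OFFSET;
}

static inline uint64_t sketch_va_to_pa(uint64_t va)
{
	return va - PAGE_OFFSET;
}
```

Because the va is computed from the pa, moving the data to a different
pa necessarily changes the va, and every pointer held by the kernel
would become stale.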

There are also other issues that make kernel pages non-migratable.
For example, the physical address may be cached somewhere and used later.
It is not practical to update all of these cached copies.

When doing memory hotplug in Linux, we first migrate all the pages in one
memory device somewhere else, and then remove the device. But if pages are
used by the kernel, they are not migratable. As a result, memory used by
the kernel cannot be hot-removed.

Modifying the kernel direct mapping mechanism is too difficult, and it
could degrade kernel performance and stability. So we take the following
approach to memory hotplug.


[What we are doing]

In Linux, memory in one NUMA node is divided into several zones. One of the
zones is ZONE_MOVABLE, which the kernel does not use for its own allocations.

In order to implement memory hotplug in Linux, we are going to arrange all
hotpluggable memory in ZONE_MOVABLE so that the kernel won't use this memory.

To do this, we need ACPI's help.


[How we do this]

In ACPI, the SRAT (System Resource Affinity Table) contains NUMA info. The
memory affinity structures in SRAT record every memory range in the system,
along with flags specifying whether the memory range is hotpluggable.
(Please refer to ACPI spec 5.0 5.2.16)
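
A minimal sketch of how such a flag could be tested is shown here. The
bit positions follow the Memory Affinity Structure flags in the ACPI
spec; the struct is a simplified, hypothetical view (the real entry has
more fields and a packed layout), and the sketch_* names are made up
for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits of the SRAT Memory Affinity Structure. */
#define SRAT_MEM_ENABLED        (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE  (1u << 1)
#define SRAT_MEM_NON_VOLATILE   (1u << 2)

/* Simplified, hypothetical view of one memory affinity entry. */
struct sketch_srat_mem {
	uint32_t proximity_domain;	/* NUMA node the range belongs to */
	uint64_t base;			/* start of the physical range */
	uint64_t length;		/* size of the range in bytes */
	uint32_t flags;
};

static bool sketch_is_hotpluggable(const struct sketch_srat_mem *m)
{
	/* A disabled entry describes no usable memory, so ignore it. */
	if (!(m->flags & SRAT_MEM_ENABLED))
		return false;
	return (m->flags & SRAT_MEM_HOT_PLUGGABLE) != 0;
}
```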

With the help of SRAT, we have to do the following two things to achieve our
goal:

1. When doing memory hot-add, allow users to arrange hotpluggable memory as
   ZONE_MOVABLE.
   (This has been done by the MOVABLE_NODE functionality in Linux.)

2. When the system is booting, prevent the bootmem allocator from allocating
   hotpluggable memory for the kernel before memory initialization
   finishes.
   (This is what we are going to do. See below.)


[About this patch-set]

In the previous part's patches, we made the kernel allocate memory near the
kernel image before SRAT is parsed, to avoid allocating hotpluggable memory
for the kernel. So this patch-set does the following things:

1. Improve memblock to support flags, which are used to indicate different
   memory types.

2. Mark all hotpluggable memory in memblock.memory[].

3. Make the default memblock allocator skip hotpluggable memory.

4. Improve the "movable_node" boot option to have higher priority than the
   movablecore and kernelcore boot options.
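
Steps 1 to 3 can be sketched roughly as follows. This is a simplified
user-space model, not the code in the patches: MEMBLOCK_HOTPLUG is the
flag name from patch 2 but its value here is assumed, struct
sketch_region and the sketch_* helpers are hypothetical, and the real
memblock allocator also tracks reserved regions, which is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag name taken from patch 2 of this series; the value is assumed. */
#define MEMBLOCK_HOTPLUG 0x1UL

/* Hypothetical, simplified stand-in for one memblock.memory[] entry. */
struct sketch_region {
	uint64_t base;
	uint64_t size;
	unsigned long flags;
};

/* Step 2: flag every region lying entirely inside [base, base + size). */
static void sketch_mark_hotplug(struct sketch_region *r, size_t n,
				uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].base >= base &&
		    r[i].base + r[i].size <= base + size)
			r[i].flags |= MEMBLOCK_HOTPLUG;
}

/*
 * Step 3: return the first region that can hold `size` bytes, skipping
 * hotpluggable regions only when skip_hotplug is set (for example when
 * movable_node was specified). Returns NULL if nothing fits.
 */
static struct sketch_region *sketch_find(struct sketch_region *r, size_t n,
					 uint64_t size, bool skip_hotplug)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_hotplug && (r[i].flags & MEMBLOCK_HOTPLUG))
			continue;
		if (r[i].size >= size)
			return &r[i];
	}
	return NULL;
}
```

Passing skip_hotplug as a parameter mirrors the intent that hotpluggable
regions are only avoided when the user asked for it; otherwise the
search behaves as if the flag did not exist.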

Change log v1 -> v2:
1. Rebase this part on the v7 version of part1
2. Fix bug: memblock still checked for hotpluggable memory when allocating
   memory, even if the movable_node boot option was not specified.

Tang Chen (7):
  memblock, numa: Introduce flag into memblock
  memblock, mem_hotplug: Introduce MEMBLOCK_HOTPLUG flag to mark
    hotpluggable regions
  memblock: Make memblock_set_node() support different memblock_type
  acpi, numa, mem_hotplug: Mark hotpluggable memory in memblock
  acpi, numa, mem_hotplug: Mark all nodes the kernel resides
    un-hotpluggable
  memblock, mem_hotplug: Make memblock skip hotpluggable regions if
    needed
  x86, numa, acpi, memory-hotplug: Make movable_node have higher
    priority

Yasuaki Ishimatsu (1):
  x86: get pg_data_t's memory from other node

 arch/metag/mm/init.c      |    3 +-
 arch/metag/mm/numa.c      |    3 +-
 arch/microblaze/mm/init.c |    3 +-
 arch/powerpc/mm/mem.c     |    2 +-
 arch/powerpc/mm/numa.c    |    8 ++-
 arch/sh/kernel/setup.c    |    4 +-
 arch/sparc/mm/init_64.c   |    5 +-
 arch/x86/mm/init_32.c     |    2 +-
 arch/x86/mm/init_64.c     |    2 +-
 arch/x86/mm/numa.c        |   63 +++++++++++++++++++++--
 arch/x86/mm/srat.c        |    5 ++
 include/linux/memblock.h  |   39 ++++++++++++++-
 mm/memblock.c             |  123 ++++++++++++++++++++++++++++++++++++++-------
 mm/memory_hotplug.c       |    1 +
 mm/page_alloc.c           |   28 ++++++++++-
 15 files changed, 252 insertions(+), 39 deletions(-)
