Message-Id: <20170726083333.17754-1-mhocko@kernel.org>
Date:   Wed, 26 Jul 2017 10:33:28 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     linux-mm@...ck.org
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Jerome Glisse <jglisse@...hat.com>,
        Reza Arbab <arbab@...ux.vnet.ibm.com>,
        Yasuaki Ishimatsu <yasu.isimatu@...il.com>,
        qiuxishi@...wei.com, Kani Toshimitsu <toshi.kani@....com>,
        slaoub@...il.com, Joonsoo Kim <js1304@...il.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Daniel Kiper <daniel.kiper@...cle.com>,
        Igor Mammedov <imammedo@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Dan Williams <dan.j.williams@...el.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Michal Hocko <mhocko@...e.com>,
        Paul Mackerras <paulus@...ba.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tony Luck <tony.luck@...el.com>,
        Will Deacon <will.deacon@....com>
Subject: [RFC PATCH 0/5] mm, memory_hotplug: allocate memmap from hotadded memory

Hi,
this is another step to make the memory hotplug more usable. The primary
goal of this patchset is to reduce memory overhead of the hot added
memory (at least for SPARSE_VMEMMAP memory model). Currently we use
kmalloc to populate memmap (struct page array) which has two main
drawbacks: a) it consumes additional memory until the hotadded memory
itself is onlined and b) memmap might end up on a different NUMA node,
which is especially true for movable_node configuration.

a) is a problem especially for memory hotplug based memory "ballooning"
solutions, where the delay between physical memory hotplug and the
onlining can lead to OOM. That led to the introduction of hacks like auto
onlining (see 31bc3858ea3e ("memory-hotplug: add automatic onlining
policy for the newly added memory")).
b) can have performance drawbacks.

One way to mitigate both issues is to simply allocate memmap array
(which is the largest memory footprint of the physical memory hotplug)
from the hotadded memory itself. VMEMMAP memory model allows us to map
any pfn range so the memory doesn't need to be online to be usable
for the array. See patch 3 for more details. In short I am reusing an
existing vmem_altmap which wants to achieve the same thing for nvdimm
device memory.
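
To make the idea concrete, here is a minimal userspace sketch of an
altmap-style reservation: the first pfns of the hotadded range are set
aside and handed out for the memmap, and only the remainder becomes
ordinary memory once onlined. This is not the kernel implementation. The
struct and helper names (altmap_model, altmap_model_alloc,
altmap_model_first_usable_pfn) are hypothetical, and the fields are only
loosely modelled on struct vmem_altmap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified userspace model of an altmap-style reservation.
 * Hypothetical type, NOT the kernel's struct vmem_altmap.
 */
struct altmap_model {
	uint64_t base_pfn;	/* first pfn of the hotadded range */
	uint64_t free;		/* pfns set aside at the range start for memmap */
	uint64_t alloc;		/* pfns already handed out for memmap */
};

/*
 * Hand out nr_pfns from the reserved area at the start of the range.
 * Returns the first pfn of the allocation, or UINT64_MAX if the
 * reserved area cannot satisfy the request.
 */
static uint64_t altmap_model_alloc(struct altmap_model *am, uint64_t nr_pfns)
{
	uint64_t pfn;

	assert(am->alloc <= am->free);
	if (nr_pfns > am->free - am->alloc)
		return UINT64_MAX;

	pfn = am->base_pfn + am->alloc;
	am->alloc += nr_pfns;
	return pfn;
}

/* First pfn that is left over for ordinary use after the reservation. */
static uint64_t altmap_model_first_usable_pfn(const struct altmap_model *am)
{
	return am->base_pfn + am->free;
}
```

With base_pfn = 0x100000 and free = 512 in this model, two 256-pfn
requests are served from the start of the range, a third request fails,
and the usable memory starts at pfn 0x100200.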

I am sending this as an RFC because this has seen only very limited
testing and I am mostly interested in opinions on the chosen
approach. I had to touch some arch code and I have no idea whether my
changes make sense there (especially ppc). Therefore I would highly
appreciate it if arch maintainers could check patch 2.

Patches 4 and 5 should be straightforward cleanups.

There is also one potential drawback, though. If somebody uses memory
hotplug for 1G (gigantic) hugetlb pages then this scheme will obviously
not work for them because each memory section will contain a 2MB
reserved area. I am not really sure anybody does that or how reliably
that can actually work. Nevertheless, I _believe_ that onlining more
memory into virtual machines is a much more common usecase. Anyway, if
there ever is a strong demand for such a usecase we have basically 3
options: a) enlarge memory sections, b) enhance the altmap allocation
strategy and reuse low memory sections to host memmaps of other sections
on the same NUMA node, c) make the memmap allocation strategy
configurable so that it can fall back to the current allocation.
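
The 2MB figure and the hugetlb conflict can be checked with simple
arithmetic. The sketch below assumes 4KB base pages, 128MB memory
sections and a 64 byte struct page (typical for x86_64, but these
constants are assumptions and differ between architectures and
configs). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE		4096ULL		/* assumed 4KB base page */
#define MODEL_SECTION_SIZE	(128ULL << 20)	/* assumed 128MB section */
#define MODEL_STRUCT_PAGE_SIZE	64ULL		/* assumed sizeof(struct page) */
#define MODEL_GIGANTIC_PAGE	(1ULL << 30)	/* 1G hugetlb page */

/* Bytes of memmap needed to describe a section-aligned hotadded range. */
static uint64_t memmap_bytes(uint64_t hotadd_bytes)
{
	assert(hotadd_bytes % MODEL_SECTION_SIZE == 0);
	return hotadd_bytes / MODEL_PAGE_SIZE * MODEL_STRUCT_PAGE_SIZE;
}

/* Number of memory sections a single 1G page would have to span. */
static uint64_t sections_per_gigantic_page(void)
{
	return MODEL_GIGANTIC_PAGE / MODEL_SECTION_SIZE;
}
```

Under these assumptions memmap_bytes(MODEL_SECTION_SIZE) is 2MB, and a 1G
page would have to span 8 sections, each of which starts with its own
reserved memmap area, so no contiguous free 1G range remains.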

Are there any other concerns, ideas, comments?

The patches are based on the current mmotm tree (mmotm-2017-07-12-15-11).

Diffstat says
 arch/arm64/mm/mmu.c            |  9 ++++--
 arch/ia64/mm/discontig.c       |  4 ++-
 arch/powerpc/mm/init_64.c      | 34 ++++++++++++++++------
 arch/s390/mm/vmem.c            |  7 +++--
 arch/sparc/mm/init_64.c        |  6 ++--
 arch/x86/mm/init_64.c          | 13 +++++++--
 include/linux/memory_hotplug.h |  7 +++--
 include/linux/memremap.h       | 34 +++++++++++++++-------
 include/linux/mm.h             | 25 ++++++++++++++--
 include/linux/page-flags.h     | 18 ++++++++++++
 kernel/memremap.c              |  6 ----
 mm/compaction.c                |  3 ++
 mm/memory_hotplug.c            | 66 +++++++++++++++++++-----------------------
 mm/page_alloc.c                | 25 ++++++++++++++--
 mm/page_isolation.c            | 11 ++++++-
 mm/sparse-vmemmap.c            | 13 +++++++--
 mm/sparse.c                    | 36 ++++++++++++++++-------
 17 files changed, 223 insertions(+), 94 deletions(-)

Shortlog
Michal Hocko (5):
      mm, memory_hotplug: cleanup memory offline path
      mm, arch: unify vmemmap_populate altmap handling
      mm, memory_hotplug: allocate memmap from the added memory range for sparse-vmemmap
      mm, sparse: complain about implicit altmap usage in vmemmap_populate
      mm, sparse: rename kmalloc_section_memmap, __kfree_section_memmap

