Message-Id: <20190725160207.19579-1-osalvador@suse.de>
Date:   Thu, 25 Jul 2019 18:02:02 +0200
From:   Oscar Salvador <osalvador@...e.de>
To:     akpm@...ux-foundation.org
Cc:     dan.j.williams@...el.com, david@...hat.com,
        pasha.tatashin@...een.com, mhocko@...e.com,
        anshuman.khandual@....com, Jonathan.Cameron@...wei.com,
        vbabka@...e.cz, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Oscar Salvador <osalvador@...e.de>
Subject: [PATCH v3 0/5] Allocate memmap from hotadded memory

Here we go with v3.

v2 -> v3:
        * Rewrote the vmemmap pages handling.
          Prior to this version, I was (ab)using hugepage fields
          from struct page, while here I am officially adding a new
          sub-page type with the fields I need (see the sketch right
          after this list).

        * Drop MHP_MEMMAP_{MEMBLOCK,DEVICE} in favor of MHP_MEMMAP_ON_MEMORY.
          While I am still not 100% sure this is the right decision, and
          while I still see some gain in having MHP_MEMMAP_{MEMBLOCK,DEVICE},
          having only one flag eases the code.
          If the user wants to allocate memmaps per memblock, it will
          have to call the add_memory() variants at memory-block granularity.

          If a clearer use case for an MHP_MEMMAP_MEMBLOCK flag shows up
          in the future, so the user does not have to care about how the
          add_memory() variants are called but only about passing a flag,
          we can add it. I already have the code, so adding it later will
          be easy.

        * Added a granularity check when hot-removing memory:
          memory can only be hot-removed at the same granularity
          it was hot-added with.
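
For reference, a minimal sketch of how such a sub-page type can be
declared with the existing page-type machinery in
include/linux/page-flags.h. The PG_vmemmap name and the bit value are
assumptions for illustration, not necessarily what patch 2 ends up
using:

/*
 * Page types live in the page_type field of struct page (a union with
 * _mapcount); PAGE_TYPE_OPS() generates the PageVmemmap(),
 * __SetPageVmemmap() and __ClearPageVmemmap() helpers.
 */
#define PG_vmemmap	0x00000800	/* assumed: next free page-type bit */
PAGE_TYPE_OPS(Vmemmap, vmemmap)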

[Testing]

 - x86_64: small and large memblocks (128MB, 1G and 2G)

So far, only acpi memory hotplug uses the new flag.
The other callers can be changed depending on their needs.
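
As a sketch, the acpi call site then boils down to passing the flag
through an extended __add_memory() interface (the extended signature
below is an assumption of this sketch, not necessarily the final form):

result = __add_memory(node, info->start_addr, info->length,
                      MHP_MEMMAP_ON_MEMORY);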

[Coverletter]

This is another step to make memory hotplug more usable. The primary
goal of this patchset is to reduce the memory overhead of hot-added
memory (at least for the SPARSEMEM_VMEMMAP memory model). The way we
currently populate the memmap (the struct page array) has two main
drawbacks:

a) it consumes additional memory until the hot-added memory itself is
   onlined, and
b) the memmap might end up on a different NUMA node, which is
   especially true for the movable_node configuration.

a) is a problem especially for memory-hotplug-based memory "ballooning"
   solutions, where the delay between physical memory hotplug and
   onlining can lead to OOM; that led to the introduction of hacks like
   auto-onlining (see 31bc3858ea3e ("memory-hotplug: add automatic
   onlining policy for the newly added memory")).

b) can have performance drawbacks.

One way to mitigate all these issues is to simply allocate the memmap
array (which is the largest memory footprint of physical memory
hotplug) from the hot-added memory itself. The SPARSEMEM_VMEMMAP memory
model allows us to map any pfn range, so the memory does not need to be
online to be usable for the array. See patch 3 for more details.
This feature is only usable when CONFIG_SPARSEMEM_VMEMMAP is set.
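
To put a number on that overhead (assuming 4KB base pages and a
64-byte struct page, as on a typical x86_64 configuration):

    128MB memory block = 32768 base pages
    32768 * 64 bytes   = 2MB of memmap (~1.6% of the block)

so self-hosting the memmap avoids 2MB of foreign allocations for every
128MB block that is hot-added but not yet onlined.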

[Overall design]:

Implementation-wise, we reuse the vmem_altmap infrastructure to
override the default allocator used by vmemmap_populate(). Once the
memmap is allocated, we need a way to mark the altmap pfns used for the
allocation. If the MHP_MEMMAP_ON_MEMORY flag was passed, we set up the
layout of the altmap structure at the beginning of __add_pages(), and
then we call mark_vmemmap_pages().
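
A minimal sketch of that setup (the field names come from the existing
struct vmem_altmap; nr_vmemmap_pages is an illustrative variable, not
necessarily what the patch calls it):

/*
 * Serve vmemmap_populate() allocations from the first
 * nr_vmemmap_pages of the hot-added range itself.
 */
struct vmem_altmap altmap = {
	.base_pfn = PHYS_PFN(start),	/* first pfn of the hot-added range */
	.free	  = nr_vmemmap_pages,	/* pages reserved for the memmap */
};

Whatever vmemmap_populate() hands out from there is accounted in
altmap.alloc, and those are the pfns that mark_vmemmap_pages() later
flags with the new page type.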

The MHP_MEMMAP_ON_MEMORY flag tells the code to allocate the memmap
from the hot-added range.
If a caller wants memmaps to be allocated per memory block, it has to
call the add_memory() variants at memory-block granularity, spanning
the whole range; if it wants the memmap allocated for the whole memory
range, a single call will do.

E.g., to add 384MB (3 sections, 3 memory blocks), either:

add_memory(0x1000, size_memory_block);
add_memory(0x2000, size_memory_block);
add_memory(0x3000, size_memory_block);

or

add_memory(0x1000, size_memory_block * 3);

One thing worth mentioning is that vmemmap pages residing in movable
memory are not a show-stopper for that memory to be offlined/migrated
away. Vmemmap pages are simply skipped in that case, and they stick
around until the sections referred to by those vmemmap pages are
hot-removed.
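
Schematically, the offlining walk then looks like the skeleton below
(not the exact hunk from the series; PageVmemmap() is the test for the
new page type that patch 2 introduces):

for (pfn = start_pfn; pfn < end_pfn; pfn++) {
	struct page *page = pfn_to_page(pfn);

	/* Self-hosted memmap: nothing to migrate, leave it in place */
	if (PageVmemmap(page))
		continue;

	/* isolate/migrate the page as usual */
}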

Oscar Salvador (5):
  mm,memory_hotplug: Introduce MHP_MEMMAP_ON_MEMORY
  mm: Introduce a new Vmemmap page-type
  mm,sparse: Add SECTION_USE_VMEMMAP flag
  mm,memory_hotplug: Allocate memmap from the added memory range for
    sparse-vmemmap
  mm,memory_hotplug: Allow userspace to enable/disable vmemmap

 arch/powerpc/mm/init_64.c      |   7 ++
 arch/s390/mm/init.c            |   6 ++
 arch/x86/mm/init_64.c          |  10 +++
 drivers/acpi/acpi_memhotplug.c |   3 +-
 drivers/base/memory.c          |  35 +++++++++-
 drivers/dax/kmem.c             |   2 +-
 drivers/hv/hv_balloon.c        |   2 +-
 drivers/s390/char/sclp_cmd.c   |   2 +-
 drivers/xen/balloon.c          |   2 +-
 include/linux/memory_hotplug.h |  37 ++++++++--
 include/linux/memremap.h       |   2 +-
 include/linux/mm.h             |  17 +++++
 include/linux/mm_types.h       |   5 ++
 include/linux/mmzone.h         |   8 ++-
 include/linux/page-flags.h     |  19 +++++
 mm/compaction.c                |   7 ++
 mm/memory_hotplug.c            | 153 +++++++++++++++++++++++++++++++++++++----
 mm/page_alloc.c                |  26 ++++++-
 mm/page_isolation.c            |  14 +++-
 mm/sparse.c                    | 116 ++++++++++++++++++++++++++++++-
 20 files changed, 441 insertions(+), 32 deletions(-)

-- 
2.12.3
