Message-ID: <d9dbefb8-052e-7cb5-3de4-245d05270ff9@redhat.com>
Date: Fri, 25 Jan 2019 09:53:35 +0100
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>, linux-mm@...ck.org
Cc: mhocko@...e.com, dan.j.williams@...el.com,
Pavel.Tatashin@...rosoft.com, linux-kernel@...r.kernel.org,
dave.hansen@...el.com
Subject: Re: [RFC PATCH v2 0/4] mm, memory_hotplug: allocate memmap from
hotadded memory
On 22.01.19 11:37, Oscar Salvador wrote:
> Hi,
>
> this is the v2 of the first RFC I sent back then in October [1].
> In this new version I tried to reduce the complexity as much as possible,
> plus some clean ups.
>
> [Testing]
>
> I have tested it on x86_64 (small/big memblocks) and on powerpc.
> On both architectures hot-add/hot-remove and online/offline operations
> worked as expected using vmemmap pages; I have not seen any issues so far.
> I wanted to try it out on Hyper-V/Xen, but I did not manage to.
> I plan to do so later this week (if time allows).
> I would also like to test it on arm64, but I am not sure I can grab
> an arm64 box anytime soon.
>
> [Coverletter]:
>
> This is another step to make memory hotplug more usable. The primary
> goal of this patchset is to reduce the memory overhead of hot-added
> memory (at least for the SPARSE_VMEMMAP memory model). The current way we
> populate the memmap (the struct page array) has two main drawbacks:
>
> a) it consumes additional memory until the hotadded memory itself is
> onlined, and
> b) the memmap might end up on a different NUMA node, which is especially
> true for the movable_node configuration.
>
> a) is a problem especially for memory-hotplug-based memory "ballooning"
> solutions, where the delay between the physical memory hotplug and the
> onlining can lead to OOM; that led to the introduction of hacks like auto
> onlining (see 31bc3858ea3e ("memory-hotplug: add automatic onlining
> policy for the newly added memory")).
>
> b) can have performance drawbacks.
>
> I have also seen hot-add operations failing on powerpc because we try
> to use order-8 allocations when populating the memmap array.
> Given the 64KB base page size, an order-8 allocation is 2^8 = 256
> contiguous pages, i.e. 16MB.
> If we run out of those, we just fail the operation and cannot add
> more memory.
> We could fall back to base pages as x86_64 does, but we can do better.
>
> One way to mitigate all these issues is to simply allocate the memmap
> array (which is the largest memory footprint of physical memory hotplug)
> from the hotadded memory itself. The VMEMMAP memory model allows us to map
> any pfn range, so the memory does not need to be online to be usable
> for the array. See patch 3 for more details. In short, I am reusing the
> existing vmem_altmap mechanism, which achieves the same thing for nvdimm
> device memory.
>
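To put rough numbers on the memmap overhead described above, here is a
back-of-the-envelope sketch (plain userspace C, not from the patches;
it assumes the usual 64-byte struct page, 4 KiB pages and 128 MiB memory
blocks on x86_64, and 64 KiB pages with a 1 GiB block as a made-up ppc64
example):

#include <stdio.h>

#define STRUCT_PAGE_SIZE 64UL	/* assumed: typical sizeof(struct page) */

/* how much memmap a hot-added block needs, and how many pages of the
 * block itself would be carved out to hold it */
static void show(const char *arch, unsigned long block_size,
		 unsigned long page_size)
{
	unsigned long nr_pages = block_size / page_size;
	unsigned long memmap_bytes = nr_pages * STRUCT_PAGE_SIZE;
	unsigned long vmemmap_pages = (memmap_bytes + page_size - 1) / page_size;

	printf("%s: %4lu MiB block, %2lu KiB pages -> %4lu KiB of memmap "
	       "(%lu pages of the block)\n",
	       arch, block_size >> 20, page_size >> 10,
	       memmap_bytes >> 10, vmemmap_pages);
}

int main(void)
{
	show("x86_64", 128UL << 20,  4UL << 10); /* 128 MiB block, 4 KiB pages */
	show("ppc64 ",   1UL << 30, 64UL << 10); /* 1 GiB block, 64 KiB pages  */
	return 0;
}

So on x86_64 a 128 MiB block carries roughly 2 MiB of memmap, which today
has to come from somewhere else until the block is onlined; allocating it
from the hot-added range itself makes that overhead self-contained.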
I only had a quick glimpse at the patches so far. I would prefer it if the
caller of add_memory() could specify whether it is ok to allocate the
vmemmap from the range.
This e.g. allows the ACPI DIMM code to allocate from the range, while
other mechanisms (Xen, Hyper-V, virtio-mem) can allow it once they
actually support it.
Also, while s390x standby memory cannot support allocating from the
range, virtio-mem could easily support it on s390x.
Not sure what such an interface should look like, but I would really like
to have control over that at the add_memory() interface, not per arch.
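Just to illustrate the kind of knob I have in mind (the flag and function
names below are made up for illustration, this is not existing kernel API;
the stand-in compiles as a plain userspace program):

#include <stdbool.h>
#include <stdio.h>

#define MHP_MEMMAP_FROM_RANGE (1UL << 0)	/* hypothetical opt-in flag */

/* stand-in for an add_memory() variant that takes caller-supplied flags */
static int add_memory_flags(int nid, unsigned long start, unsigned long size,
			    unsigned long flags)
{
	bool memmap_from_range = flags & MHP_MEMMAP_FROM_RANGE;

	printf("add %lu MiB at 0x%lx on node %d, memmap from range: %s\n",
	       size >> 20, start, nid, memmap_from_range ? "yes" : "no");
	/* ... the usual arch_add_memory()/sparse section setup would follow ... */
	return 0;
}

int main(void)
{
	/* an ACPI DIMM driver could opt in ... */
	add_memory_flags(0, 0x100000000UL, 128UL << 20, MHP_MEMMAP_FROM_RANGE);
	/* ... while e.g. s390x standby memory would simply not set the flag */
	add_memory_flags(0, 0x140000000UL, 128UL << 20, 0);
	return 0;
}

That way each memory-adding driver decides per add_memory() call whether
the range is actually accessible early enough to host its own memmap.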
--
Thanks,
David / dhildenb