[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170726210657.GE21717@redhat.com>
Date: Wed, 26 Jul 2017 17:06:59 -0400
From: Jerome Glisse <jglisse@...hat.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
Andrea Arcangeli <aarcange@...hat.com>,
Reza Arbab <arbab@...ux.vnet.ibm.com>,
Yasuaki Ishimatsu <yasu.isimatu@...il.com>,
qiuxishi@...wei.com, Kani Toshimitsu <toshi.kani@....com>,
slaoub@...il.com, Joonsoo Kim <js1304@...il.com>,
Andi Kleen <ak@...ux.intel.com>,
Daniel Kiper <daniel.kiper@...cle.com>,
Igor Mammedov <imammedo@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Catalin Marinas <catalin.marinas@....com>,
Dan Williams <dan.j.williams@...el.com>,
Fenghua Yu <fenghua.yu@...el.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Michael Ellerman <mpe@...erman.id.au>,
Michal Hocko <mhocko@...e.com>,
Paul Mackerras <paulus@...ba.org>,
Thomas Gleixner <tglx@...utronix.de>,
Tony Luck <tony.luck@...el.com>,
Will Deacon <will.deacon@....com>
Subject: Re: [RFC PATCH 0/5] mm, memory_hotplug: allocate memmap from
hotadded memory
On Wed, Jul 26, 2017 at 10:33:28AM +0200, Michal Hocko wrote:
> Hi,
> this is another step to make the memory hotplug more usable. The primary
> goal of this patchset is to reduce memory overhead of the hot added
> memory (at least for SPARSE_VMEMMAP memory model). Currently we use
> kmalloc to poppulate memmap (struct page array) which has two main
> drawbacks a) it consumes an additional memory until the hotadded memory
> itslef is onlined and b) memmap might end up on a different numa node
> which is especially true for movable_node configuration.
>
> a) is problem especially for memory hotplug based memory "ballooning"
> solutions when the delay between physical memory hotplug and the
> onlining can lead to OOM and that led to introduction of hacks like auto
> onlining (see 31bc3858ea3e ("memory-hotplug: add automatic onlining
> policy for the newly added memory")).
> b) can have performance drawbacks.
>
> One way to mitigate both issues is to simply allocate memmap array
> (which is the largest memory footprint of the physical memory hotplug)
> from the hotadded memory itself. VMEMMAP memory model allows us to map
> any pfn range so the memory doesn't need to be online to be usable
> for the array. See patch 3 for more details. In short I am reusing an
> existing vmem_altmap which wants to achieve the same thing for nvdim
> device memory.
>
> I am sending this as an RFC because this has seen only a very limited
> testing and I am mostly interested about opinions on the chosen
> approach. I had to touch some arch code and I have no idea whether my
> changes make sense there (especially ppc). Therefore I would highly
> appreciate arch maintainers to check patch 2.
>
> Patches 4 and 5 should be straightforward cleanups.
>
> There is also one potential drawback, though. If somebody uses memory
> hotplug for 1G (gigantic) hugetlb pages then this scheme will not work
> for them obviously because each memory section will contain 2MB reserved
> area. I am not really sure somebody does that and how reliable that
> can work actually. Nevertheless, I _believe_ that onlining more memory
> into virtual machines is much more common usecase. Anyway if there ever
> is a strong demand for such a usecase we have basically 3 options a)
> enlarge memory sections b) enhance altmap allocation strategy and reuse
> low memory sections to host memmaps of other sections on the same NUMA
> node c) have the memmap allocation strategy configurable to fallback to
> the current allocation.
>
> Are there any other concerns, ideas, comments?
>
This does not seems to be an opt-in change ie if i am reading patch 3
correctly if an altmap is not provided to __add_pages() you fallback
to allocating from begining of zone. This will not work with HMM ie
device private memory. So at very least i would like to see some way
to opt-out of this. Maybe a new argument like bool forbid_altmap ?
Cheers,
Jérôme
Powered by blists - more mailing lists