Message-ID: <259b9669-0515-01a2-d714-617011f87194@redhat.com>
Date: Tue, 26 Jan 2021 16:10:53 +0100
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: Muchun Song <songmuchun@...edance.com>, corbet@....net,
mike.kravetz@...cle.com, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, x86@...nel.org, hpa@...or.com,
dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
viro@...iv.linux.org.uk, akpm@...ux-foundation.org,
paulmck@...nel.org, mchehab+huawei@...nel.org,
pawan.kumar.gupta@...ux.intel.com, rdunlap@...radead.org,
oneukum@...e.com, anshuman.khandual@....com, jroedel@...e.de,
almasrymina@...gle.com, rientjes@...gle.com, willy@...radead.org,
mhocko@...e.com, song.bao.hua@...ilicon.com,
naoya.horiguchi@....com, duanxiongchun@...edance.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages
associated with each HugeTLB page
On 26.01.21 15:58, Oscar Salvador wrote:
> On Tue, Jan 26, 2021 at 10:36:21AM +0100, David Hildenbrand wrote:
>> I think either keep it completely simple (only free vmemmap of hugetlb
>> pages allocated early during boot - which is what's not sufficient for
>> some use cases) or implement the full thing properly (meaning, solve
>> most challenging issues to get the basics running).
>>
>> I don't want to have some easy parts of complex features merged (e.g.,
>> breaking other stuff as you indicate below), and later finding out "it's
>> not that easy" again and being stuck with it forever.
>
> Well, we could try to do an optimistic allocation, without tricky loopings.
> If that fails, refuse to shrink the pool at that moment.
>
> The user could always try to shrink it later via /proc/sys/vm/nr_hugepages
> interface.
>
> But I am just thinking out loud..
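For illustration, the optimistic approach Oscar describes above could be sketched roughly as follows. This is a user-space model, not kernel code: struct pool_model, alloc_once_fn, alloc_from_budget() and shrink_pool_optimistic() are hypothetical names invented here, and the callback stands in for a single, non-retrying attempt to allocate the vmemmap pages of one huge page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hugetlb pool; not a kernel structure. */
struct pool_model {
	size_t nr_huge_pages;
};

/*
 * Hypothetical callback: one attempt to allocate the vmemmap pages
 * for a single huge page. Returns false on failure, never retries.
 */
typedef bool (*alloc_once_fn)(void *ctx);

/* Example stub: succeeds until a caller-supplied budget runs out. */
static bool alloc_from_budget(void *ctx)
{
	size_t *budget = ctx;

	if (*budget == 0)
		return false;
	(*budget)--;
	return true;
}

/*
 * Shrink the pool towards @target. The loop walks over pages; it is
 * not a retry loop. On the first failed allocation we stop and leave
 * the remaining pages in the pool, so the user can try again later.
 * Returns the number of huge pages actually released.
 */
static size_t shrink_pool_optimistic(struct pool_model *pool, size_t target,
				     alloc_once_fn alloc_once, void *ctx)
{
	size_t freed = 0;

	while (pool->nr_huge_pages > target) {
		if (!alloc_once(ctx))
			break;	/* refuse to shrink further for now */
		pool->nr_huge_pages--;
		freed++;
	}
	return freed;
}
```

With a pool of 5 pages and a budget of 2 successful allocations, a request to shrink to 0 releases 2 pages and leaves 3 in the pool; a later call with more budget can continue from there.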
The real issue seems to be discarding the vmemmap on any memory that has
movability constraints - CMA and ZONE_MOVABLE; otherwise, as discussed,
we can reuse parts of the thingy we're freeing for the vmemmap. Not that
it would be ideal: that once-a-huge-page thing will never ever be a huge
page again - but if it helps with OOM in corner cases, sure.

Possible simplification: don't perform the optimization for now with
free huge pages residing on ZONE_MOVABLE or CMA. Certainly not perfect:
what happens when migrating a huge page from ZONE_NORMAL to
(ZONE_MOVABLE|CMA)?
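A minimal sketch of that simplification, as a user-space model rather than kernel code. All names here (enum zone_kind, struct hpage_model, has_movability_constraints(), try_free_vmemmap()) are hypothetical and only illustrate the policy; the migration question above is deliberately left open.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the zone a huge page resides in. */
enum zone_kind {
	ZONE_KIND_NORMAL,
	ZONE_KIND_MOVABLE,
};

/* Hypothetical model of a free huge page; not struct page. */
struct hpage_model {
	enum zone_kind zone;
	bool in_cma;		/* page belongs to a CMA area */
	bool vmemmap_freed;	/* vmemmap optimization applied */
};

/* Memory with movability constraints: ZONE_MOVABLE or CMA. */
static bool has_movability_constraints(const struct hpage_model *p)
{
	return p->zone == ZONE_KIND_MOVABLE || p->in_cma;
}

/*
 * Apply the vmemmap optimization only to pages without movability
 * constraints. Returns true if the vmemmap was (notionally) freed.
 * What to do when such a page is later migrated to ZONE_MOVABLE or
 * CMA is not modelled here.
 */
static bool try_free_vmemmap(struct hpage_model *p)
{
	if (has_movability_constraints(p))
		return false;
	p->vmemmap_freed = true;
	return true;
}
```

In this model, a page in a normal zone gets its vmemmap freed, while a page in ZONE_MOVABLE or in a CMA area is left untouched.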
>
>>> Of course, this means that e.g. memory-hotplug (hot-remove) will not fully work
>>> when this is in place, but well.
>>
>> Can you elaborate? Are we talking about having hugepages in
>> ZONE_MOVABLE that are not migratable (and/or dissolvable) anymore? Then
>> a clear NACK from my side.
>
> Pretty much, yeah.
Note that we will most likely soon have to tackle migrating/dissolving (free)
hugetlbfs pages from alloc_contig_range() context - e.g., for CMA
allocations. That's certainly something to keep in mind regarding any
approaches that already break offline_pages().
--
Thanks,
David / dhildenb