linux-kernel - Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page


Message-ID: <259b9669-0515-01a2-d714-617011f87194@redhat.com>
Date:   Tue, 26 Jan 2021 16:10:53 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Oscar Salvador <osalvador@...e.de>
Cc:     Muchun Song <songmuchun@...edance.com>, corbet@....net,
        mike.kravetz@...cle.com, tglx@...utronix.de, mingo@...hat.com,
        bp@...en8.de, x86@...nel.org, hpa@...or.com,
        dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
        viro@...iv.linux.org.uk, akpm@...ux-foundation.org,
        paulmck@...nel.org, mchehab+huawei@...nel.org,
        pawan.kumar.gupta@...ux.intel.com, rdunlap@...radead.org,
        oneukum@...e.com, anshuman.khandual@....com, jroedel@...e.de,
        almasrymina@...gle.com, rientjes@...gle.com, willy@...radead.org,
        mhocko@...e.com, song.bao.hua@...ilicon.com,
        naoya.horiguchi@....com, duanxiongchun@...edance.com,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages
 associated with each HugeTLB page

On 26.01.21 15:58, Oscar Salvador wrote:
> On Tue, Jan 26, 2021 at 10:36:21AM +0100, David Hildenbrand wrote:
>> I think either keep it completely simple (only free vmemmap of hugetlb
>> pages allocated early during boot - which is what's not sufficient for
>> some use cases) or implement the full thing properly (meaning, solve
>> most challenging issues to get the basics running).
>>
>> I don't want to have some easy parts of complex features merged (e.g.,
>> breaking other stuff as you indicate below), and later finding out "it's
>> not that easy" again and being stuck with it forever.
> 
> Well, we could try to do an optimistic allocation, without tricky loopings.
> If that fails, refuse to shrink the pool at that moment.
> 
> The user could always try to shrink it later via /proc/sys/vm/nr_hugepages
> interface.
> 
> But I am just thinking out loud..

The real issue seems to be discarding the vmemmap on any memory that has 
movability constraints - CMA and ZONE_MOVABLE; otherwise, as discussed, 
we can reuse parts of the thingy we're freeing for the vmemmap. Not that 
it would be ideal: that once-a-huge-page thing will never ever be a huge 
page again - but if it helps with OOM in corner cases, sure.

Possible simplification: don't perform the optimization for now with 
free huge pages residing on ZONE_MOVABLE or CMA. Certainly not perfect: 
what happens when migrating a huge page from ZONE_NORMAL to 
(ZONE_MOVABLE|CMA)?
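
A rough sketch of that check, written as a self-contained userspace model
rather than kernel code. Every name here (hpage_model, zone_kind,
may_free_vmemmap, migration_target_needs_vmemmap) is hypothetical and only
illustrates the idea; none of it is taken from the patch set:

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's real types. */
enum zone_kind { ZONE_KIND_NORMAL, ZONE_KIND_MOVABLE };

struct hpage_model {
	enum zone_kind zone;	/* zone the huge page resides on */
	bool in_cma;		/* page lies inside a CMA area */
	bool vmemmap_freed;	/* optimization already applied to this page */
};

/*
 * Assumed policy: skip the vmemmap optimization for huge pages with
 * movability constraints (ZONE_MOVABLE or CMA).
 */
static bool may_free_vmemmap(const struct hpage_model *hp)
{
	return hp->zone != ZONE_KIND_MOVABLE && !hp->in_cma;
}

/*
 * The open question: a source page that had its vmemmap freed is migrated
 * to a target that must keep a full vmemmap. Returns true when the target
 * cannot carry the optimization over.
 */
static bool migration_target_needs_vmemmap(const struct hpage_model *src,
					   const struct hpage_model *dst)
{
	return src->vmemmap_freed && !may_free_vmemmap(dst);
}
```

The second helper only flags the ZONE_NORMAL -> (ZONE_MOVABLE|CMA) case; it
does not say how that case should be handled.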

> 
>>> Of course, this means that e.g: memory-hotplug (hot-remove) will not fully work
>>> when this is in place, but well.
>>
>> Can you elaborate? Are we talking about having hugepages in
>> ZONE_MOVABLE that are not migratable (and/or dissolvable) anymore? Then
>> a clear NACK from my side.
> 
> Pretty much, yeah.

Note that we will most likely soon have to tackle migrating/dissolving (free)
hugetlbfs pages from alloc_contig_range() context - e.g., for CMA 
allocations. That's certainly something to keep in mind regarding any 
approaches that already break offline_pages().

-- 
Thanks,

David / dhildenb