linux-kernel - Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page

Message-ID: <41160c2e-817d-3ef2-0475-4db58827c1c3@redhat.com>
Date:   Mon, 1 Feb 2021 17:10:27 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Mike Kravetz <mike.kravetz@...cle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Oscar Salvador <osalvador@...e.de>
Cc:     Jonathan Corbet <corbet@....net>,
        Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
        bp@...en8.de, x86@...nel.org, hpa@...or.com,
        dave.hansen@...ux.intel.com, luto@...nel.org,
        Peter Zijlstra <peterz@...radead.org>, viro@...iv.linux.org.uk,
        Andrew Morton <akpm@...ux-foundation.org>, paulmck@...nel.org,
        mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
        Randy Dunlap <rdunlap@...radead.org>, oneukum@...e.com,
        anshuman.khandual@....com, jroedel@...e.de,
        Mina Almasry <almasrymina@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Michal Hocko <mhocko@...e.com>,
        "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>,
        HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>,
        Xiongchun duan <duanxiongchun@...edance.com>,
        linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH v13 05/12] mm: hugetlb: allocate the
 vmemmap pages associated with each HugeTLB page

>> What's your opinion about this? Should we take this approach?
> 
> I think trying to solve all the issues that could happen as the result of
> not being able to dissolve a hugetlb page has made this extremely complex.
> I know this is something we need to address/solve.  We do not want to add
> more unexpected behavior in corner cases.  However, I can not help but think
> about similar issues today.  For example, if a huge page is in use in
> ZONE_MOVABLE or CMA there is no guarantee that it can be migrated today.

Yes, hugetlbfs is broken with alloc_contig_range(), as used e.g. by CMA, 
and needs fixing. Once that is fixed, problems similar to those with 
hugetlbfs pages on ZONE_MOVABLE apply.

I think hugetlbfs pages on ZONE_MOVABLE are problematic for memory 
unplug in corner cases only:

1. Insufficient memory to allocate a destination page. Well, there is 
nothing we can really do about that - it is just like trying to migrate 
any other memory and running into -ENOMEM.

2. Trying to dissolve a free huge page but running into reservation 
limits. I think we should at least try allocating a new free huge page 
before failing. To be tackled in the future.

> Correct?  We may need to allocate another huge page for the target of the
> migration, and there is no guarantee we can do that.
> 

I agree that 1. is similar to "cannot migrate because OOM".

So thinking about it again, we don't actually seem to lose that much when

a) Rejecting migration of a huge page when we cannot allocate the 
vmemmap for our source page. Our system seems to be under quite some 
memory pressure already. Migration could fail anyway because we cannot 
allocate a migration target.

b) Refusing to dissolve a huge page when we cannot allocate the 
vmemmap. Dissolving can fail already. And, again, our system seems to be 
under quite some memory pressure already.

c) Rejecting freeing of huge pages when we cannot allocate the vmemmap. 
I guess the "only" surprise is that the user might now no longer get 
what they asked for. This seems to be the "real change".
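
As a rough illustration of case c), here is a toy userspace model in 
Python (not kernel code). ToySystem, restore_vmemmap() and 
free_huge_page_to_buddy() are hypothetical names, and the page counts 
(6 vmemmap pages to restore, 512 base pages per huge page) are assumed 
values for illustration only. It shows that the vmemmap has to be 
re-allocated before the huge page can be handed back, so the free 
request itself can fail with -ENOMEM and leave the pool unchanged.

```python
import errno
from dataclasses import dataclass


@dataclass
class ToySystem:
    """Hypothetical stand-in for system state; not a kernel structure."""
    free_base_pages: int           # base pages the allocator can hand out
    free_huge_pages: int           # free huge pages sitting in the pool
    vmemmap_pages_needed: int = 6  # assumed vmemmap pages to restore per huge page


def restore_vmemmap(state):
    """Try to re-allocate the vmemmap pages of one huge page."""
    if state.free_base_pages < state.vmemmap_pages_needed:
        return -errno.ENOMEM
    state.free_base_pages -= state.vmemmap_pages_needed
    return 0


def free_huge_page_to_buddy(state, base_pages_per_huge_page=512):
    """Case c): shrink the pool by one huge page, if the vmemmap can be restored."""
    if state.free_huge_pages == 0:
        return -errno.EINVAL  # nothing to free in this toy model
    ret = restore_vmemmap(state)
    if ret:
        return ret  # the huge page stays in the pool; the request is rejected
    state.free_huge_pages -= 1
    state.free_base_pages += base_pages_per_huge_page
    return 0
```

With 2 free base pages and one pooled huge page, 
free_huge_page_to_buddy() returns -ENOMEM and the pool size stays at 1; 
with at least 6 free base pages it succeeds. Cases a) and b) would be 
gated on the same restore step in this model.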

So maybe little actually speaks against allowing migration of such huge 
pages and optimizing any huge page. The one downside is that freeing of 
huge pages can be rejected, which surprises the user/admin.

I guess while our system is under memory pressure, CMA and ZONE_MOVABLE 
can no longer always keep their guarantees anyway - until the memory 
pressure is gone.

-- 
Thanks,

David / dhildenb