linux-kernel - Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages associated with each HugeTLB page

Message-ID: <20210128222906.GA3826@localhost.localdomain>
Date:   Thu, 28 Jan 2021 23:29:06 +0100
From:   Oscar Salvador <osalvador@...e.de>
To:     David Hildenbrand <david@...hat.com>
Cc:     Muchun Song <songmuchun@...edance.com>, corbet@....net,
        mike.kravetz@...cle.com, tglx@...utronix.de, mingo@...hat.com,
        bp@...en8.de, x86@...nel.org, hpa@...or.com,
        dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
        viro@...iv.linux.org.uk, akpm@...ux-foundation.org,
        paulmck@...nel.org, mchehab+huawei@...nel.org,
        pawan.kumar.gupta@...ux.intel.com, rdunlap@...radead.org,
        oneukum@...e.com, anshuman.khandual@....com, jroedel@...e.de,
        almasrymina@...gle.com, rientjes@...gle.com, willy@...radead.org,
        mhocko@...e.com, song.bao.hua@...ilicon.com,
        naoya.horiguchi@....com, duanxiongchun@...edance.com,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v13 05/12] mm: hugetlb: allocate the vmemmap pages
 associated with each HugeTLB page

On Wed, Jan 27, 2021 at 11:36:15AM +0100, David Hildenbrand wrote:
> Extending on that, I just discovered that only x86-64, ppc64, and arm64
> really support hugepage migration.
> 
> Maybe one approach with the "magic switch" really would be to disable
> hugepage migration completely in hugepage_migration_supported(), and
> consequently making hugepage_movable_supported() always return false.

Ok, so migration would not work for these pages, and since they would
lie in !ZONE_MOVABLE there is no guarantee we can unplug the memory.
Well, we really cannot unplug it unless the hugepage is unused
(in which case it can at least be dissolved).

Now to the allocation-when-freeing.
The current implementation uses (or wants to use) GFP_ATOMIC plus a
retry-forever loop.
One of the problems I see with GFP_ATOMIC is that it gives you access
to memory reserves, but there are other users of those reserves too.
Then, in the worst case, we need to allocate 16MB worth of order-0 pages
to free up one 1GB hugepage, so the question is whether the reserves
really scale to 16MB plus the other users accessing them.
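
For reference, the 16MB figure can be reproduced with a small
userspace sketch. It assumes 4KB base pages and a 64-byte struct page
(typical for x86-64, but neither value is stated in this thread), and
the helper names are made up for illustration. It gives an upper
bound: the series keeps a few vmemmap pages mapped per hugepage, so
the number that actually has to be reallocated is slightly lower.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the thread: 4 KiB base pages and a
 * 64-byte struct page, as is typical on x86-64. */
#define BASE_PAGE_SIZE   4096ULL
#define STRUCT_PAGE_SIZE 64ULL

/* Bytes of vmemmap (the struct page array) describing one hugepage. */
static uint64_t vmemmap_bytes(uint64_t hugepage_size)
{
	/* A hugepage is always a whole number of base pages. */
	assert(hugepage_size % BASE_PAGE_SIZE == 0);
	return (hugepage_size / BASE_PAGE_SIZE) * STRUCT_PAGE_SIZE;
}

/* Number of order-0 pages needed to back that vmemmap (rounded up). */
static uint64_t vmemmap_pages(uint64_t hugepage_size)
{
	return (vmemmap_bytes(hugepage_size) + BASE_PAGE_SIZE - 1) /
	       BASE_PAGE_SIZE;
}
```

Under those assumptions, vmemmap_bytes(1ULL << 30) is 16MB, i.e. 4096
order-0 pages, while a 2MB hugepage needs only 8.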

As I said, if anything I would go for an optimistic allocation attempt:
if we fail, just refuse to shrink the pool.
The user can always try to shrink it again later via the /sys interface.
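
A rough userspace model of that all-or-nothing attempt is below. It is
only a sketch: struct page_source, source_alloc(), source_free() and
try_alloc_vmemmap() are hypothetical names, malloc() stands in for the
page allocator, and the "remaining" counter just simulates memory
pressure. The point is the shape: try once without dipping into
reserves or looping, undo everything on the first failure, and report
the failure so the caller can refuse to shrink the pool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096	/* assumed base page size */

/* Hypothetical stand-in for the page allocator under pressure. */
struct page_source {
	size_t remaining;	/* pages that can still be handed out */
};

/* Returns NULL when the source is exhausted; never blocks or retries. */
static void *source_alloc(struct page_source *src)
{
	void *page;

	if (src->remaining == 0)
		return NULL;
	page = malloc(MODEL_PAGE_SIZE);
	if (page)
		src->remaining--;
	return page;
}

static void source_free(struct page_source *src, void *page)
{
	free(page);
	src->remaining++;
}

/*
 * Try to get all nr pages. On the first failure, give back whatever
 * was already allocated and return false, leaving the source as it
 * was before the call.
 */
static bool try_alloc_vmemmap(struct page_source *src, void **pages,
			      size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		pages[i] = source_alloc(src);
		if (!pages[i]) {
			while (i > 0)
				source_free(src, pages[--i]);
			return false;
		}
	}
	return true;
}
```

A caller that gets false back would leave the hugepage in the pool
untouched; a later write to the /sys interface simply runs the same
attempt again.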

Since hugepages would no longer be in ZONE_MOVABLE/CMA and are not
expected to be migratable, is that ok?

Using the hugepage for the vmemmap array was brought up several times,
but that would imply fragmenting memory over time.

All in all, it seems to be overly complicated (I might be wrong).

> Huge pages would never get placed onto ZONE_MOVABLE/CMA and cannot be
> migrated. The problem I describe would apply (careful with using
> ZONE_MOVABLE), but well, it can at least be documented.

I am not a page allocator expert, but can't the allocation fall back
to ZONE_MOVABLE under memory shortage in other zones?

-- 
Oscar Salvador
SUSE L3