linux-kernel - Re: [PATCH v3 1/6] mm: migrate: do not migrate HugeTLB page whose refcount is one

Message-ID: <20210112145325.GS22493@dhcp22.suse.cz>
Date:   Tue, 12 Jan 2021 15:53:25 +0100
From:   Michal Hocko <mhocko@...e.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Muchun Song <songmuchun@...edance.com>, mike.kravetz@...cle.com,
        akpm@...ux-foundation.org, n-horiguchi@...jp.nec.com,
        ak@...ux.intel.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Yang Shi <shy828301@...il.com>
Subject: Re: [PATCH v3 1/6] mm: migrate: do not migrate HugeTLB page whose
 refcount is one

On Tue 12-01-21 15:41:02, David Hildenbrand wrote:
> On 12.01.21 15:23, Michal Hocko wrote:
> > On Tue 12-01-21 13:16:45, Michal Hocko wrote:
> > [...]
> >> Well, currently pool pages are not migrateable but you are right that
> >> this is likely something that we will need to look into in the future
> >> and this optimization would stand in the way.
> > 
> > After some more thinking I believe I was wrong in my last statement.
> > This optimization shouldn't have any effect on pages on the pool as
> > those stay at reference count 0 and they cannot be isolated either
> > (clear_page_huge_active before it is enqueued).
> > 
> > That being said, the migration code would still have to learn about
> > these pages but that is out of scope of this discussion.
> > 
> > Sorry about the confusion from my side.
> > 
> 
> At this point I am fairly confused what's working and what's not :D

Heh, tell me about it. Hugetlb is a maze full of land mines.

> I think this will require more thought, on how to teach
> alloc_contig_range() (and eventually in some cases offline_pages()?) to
> do the right thing.

Well, offlining sort of works because it retries both migration and
dissolving. It can fail with the latter due to reservations but that can
be expected. We can still try harder to reallocate/rebalance per-NUMA
pools to keep the reservation but I strongly suspect nobody has noticed
this to be a problem so there we are.

-- 
Michal Hocko
SUSE Labs