lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <70b5cf44-9f8c-2149-0920-823045be3fe1@oracle.com>
Date:   Tue, 12 Jan 2021 12:12:09 -0800
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Michal Hocko <mhocko@...e.com>,
        David Hildenbrand <david@...hat.com>
Cc:     Muchun Song <songmuchun@...edance.com>, akpm@...ux-foundation.org,
        n-horiguchi@...jp.nec.com, ak@...ux.intel.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Yang Shi <shy828301@...il.com>
Subject: Re: [PATCH v3 1/6] mm: migrate: do not migrate HugeTLB page whose
 refcount is one

On 1/12/21 6:53 AM, Michal Hocko wrote:
> On Tue 12-01-21 15:41:02, David Hildenbrand wrote:
>> On 12.01.21 15:23, Michal Hocko wrote:
>>> On Tue 12-01-21 13:16:45, Michal Hocko wrote:
>>> [...]
>>>> Well, currently pool pages are not migrateable but you are right that
>>>> this is likely something that we will need to look into in the future
>>>> and this optimization would stand in the way.
>>>
>>> After some more thinking I believe I was wrong in my last statement.
>>> This optimization shouldn't have any effect on pages on the pool as
>>> those stay at reference count 0 and they cannot be isolated either
>>> (clear_page_huge_active before it is enqueued).
>>>
>>> That being said, the migration code would still have to learn about
>>> about this pages but that is out of scope of this discussion.
>>>
>>> Sorry about the confusion from my side.
>>>
>>
>> At this point I am fairly confused what's working at what's not :D
> 
> heh, tell me something about that. Hugetlb is a maze full of land mines.
> 
>> I think this will require more thought, on how to teach
>> alloc_contig_range() (and eventually in some cases offline_pages()?) to
>> do the right thing.
> 
> Well, offlining sort of works because it retries both migrates and
> dissolves. It can fail with the later due to reservations but that can
> be expected. We can still try harder to rellocate/rebalance per numa
> pools to keep the reservation but I strongly suspect nobody has noticed
> this to be a problem so there we are.

Due to my time zone, I get to read all the previous comments before
commenting myself. :)

To be clear, this patch is handling a very specific case where a hugetlb
page was isolated for migration and after being isolated the last reference
to the page was dropped.  Normally, dropping the last reference would send
the page to free_huge_page processing.  free_huge_page processing would
either dissolve the page to the buddy allocator or more likely place the
page on the free list of the pool.  However, since isolation also holds
a reference to the page, processing is continued down the migration path.

Today there is no code to migrate free huge pages in the pool.  Only
active in use pages are migrated.  Without this patch, 'free pages' in
the very specific state above would be migrated.  But to be clear, that
was never the intention of any hugetlb migration code.  In that respect,
I believe this patch helps the current code work as designed.

David brings up the valid point that alloc_contig_range needs to deal
with free hugetlb pool pages.  However, that is code which needs to be
written as it does not exist today.  I suspect nobody thought about free
hugetlb pages when alloc_contig_range was written.  This patch should
in no way hinder development of that code.  Free huge pages have a ref
count of 0, and this code is checking for ref count of 1.

That is a long way of saying that I still agree with this patch.  The
only modification/suggestion would be enhancing the commit message as
suggested by Michal.
-- 
Mike Kravetz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ