linux-kernel - Re: [PATCH RESEND 0/8] hugetlb: add demote/split page functionality

Message-ID: <5f55d761-1e29-bca3-4ca5-4015f91c7802@redhat.com>
Date:   Tue, 17 Aug 2021 20:49:19 +0200
From:   David Hildenbrand <david@...hat.com>
To:     Mike Kravetz <mike.kravetz@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Michal Hocko <mhocko@...e.com>,
        Oscar Salvador <osalvador@...e.de>, Zi Yan <ziy@...dia.com>,
        Muchun Song <songmuchun@...edance.com>,
        Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH RESEND 0/8] hugetlb: add demote/split page functionality

On 17.08.21 18:19, Mike Kravetz wrote:
> On 8/17/21 12:30 AM, David Hildenbrand wrote:
>> On 17.08.21 03:46, Andrew Morton wrote:
>>> On Mon, 16 Aug 2021 17:46:58 -0700 Mike Kravetz <mike.kravetz@...cle.com> wrote:
>>>
>>>>> It really is a ton of new code.  I think we're owed much more detail
>>>>> about the problem than the above.  To be confident that all this
>>>>> material is truly justified?
>>>>
>>>> The desired functionality for this specific use case is to simply
>>>> convert a 1G hugetlb page to 512 2MB hugetlb pages.  As mentioned
>>>>
>>>> "Converting larger to smaller hugetlb pages can be accomplished today by
>>>>    first freeing the larger page to the buddy allocator and then allocating
>>>>    the smaller pages.  However, there are two issues with this approach:
>>>>    1) This process can take quite some time, especially if allocation of
>>>>       the smaller pages is not immediate and requires migration/compaction.
>>>>    2) There is no guarantee that the total size of smaller pages allocated
>>>>       will match the size of the larger page which was freed.  This is
>>>>       because the area freed by the larger page could quickly be
>>>>       fragmented."
>>>>
>>>> These two issues have been experienced in practice.
>>>
>>> Well the first issue is quantifiable.  What is "some time"?  If it's
>>> people trying to get a 5% speedup on a rare operation because hey,
>>> bugging the kernel developers doesn't cost me anything then perhaps we
>>> have better things to be doing.
>>>
>>> And the second problem would benefit from some words to help us
>>> understand how much real-world hurt this causes, and how frequently.
>>> And let's understand what the userspace workarounds look like, etc.
>>>
>>>> A big chunk of the code changes (approx 50%) is for the vmemmap
>>>> optimizations.  This is also the most complex part of the changes.
>>>> I added the code as interaction with vmemmap reduction was discussed
>>>> during the RFC.  It is only a performance enhancement and honestly
>>>> may not be worth the cost/risk.  I will get some numbers to measure
>>>> the actual benefit.
>>
>> If it really makes that much of a difference code/complexity wise, would it make sense to just limit demote functionality to the !vmemmap case for now?
>>
> 
> Handling vmemmap optimized huge pages is not that big of a deal.  We
> just use the existing functionality to populate vmemmap for the page
> being demoted, and free vmemmap for resulting pages of demoted size.
> 
> This obviously is not 'optimal' for demote as we will allocate more
> vmemmap pages than needed and then free the excess pages.  The complex
> part is not over allocating vmemmap and only sparsely populating vmemmap
> for the target pages of demote size.  This is all done in patches 6-8.
> I am happy to drop these patches for now.  They are the most complex (and
> ugly) of this series.  As mentioned, they do not provide any additional
> functionality.
> 

Just looking at the diffstat, that looks like a good idea to me :)
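
For a sense of scale, here is a rough sketch of the arithmetic behind the
demote discussion: how many 2MB pages a 1G page yields, and how much vmemmap
backs each side before any optimization. The 4 KiB base page and 64-byte
struct page are assumptions for x86-64, not values taken from the series, and
the helper names are hypothetical.

```python
# Sketch only: sizes below are assumptions for x86-64 (4 KiB base pages,
# 64-byte struct page); they are not taken from the patch series.
BASE_PAGE = 4 * 1024
STRUCT_PAGE = 64          # assumed sizeof(struct page)
SZ_2M = 2 * 1024 * 1024
SZ_1G = 1024 * 1024 * 1024


def vmemmap_pages(hugepage_size: int) -> int:
    """Base pages needed to hold every struct page of one huge page."""
    nr_struct_pages = hugepage_size // BASE_PAGE
    return nr_struct_pages * STRUCT_PAGE // BASE_PAGE


def demote_summary(src_size: int = SZ_1G, dst_size: int = SZ_2M) -> dict:
    """Count target pages and the full (unoptimized) vmemmap on each side."""
    nr_dst = src_size // dst_size
    return {
        "target_pages": nr_dst,
        "vmemmap_pages_src": vmemmap_pages(src_size),
        "vmemmap_pages_per_dst": vmemmap_pages(dst_size),
    }


if __name__ == "__main__":
    print(demote_summary())
```

Under these assumptions a 1G page splits into 512 2MB pages, and its full
vmemmap spans 4096 base pages versus 8 for each 2MB page. How many of those
pages a vmemmap-optimized hugetlb page actually keeps is left out on purpose,
since that depends on the kernel version.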

-- 
Thanks,

David / dhildenb