linux-kernel - Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <e5873a6e-db77-d654-6df6-9b5017c31f70@oracle.com>
Date:   Fri, 9 Nov 2018 16:04:56 -0800
From:   anthony.yznaga@...cle.com
To:     "Kirill A. Shutemov" <kirill@...temov.name>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        aarcange@...hat.com, aneesh.kumar@...ux.ibm.com,
        akpm@...ux-foundation.org, jglisse@...hat.com,
        khandual@...ux.vnet.ibm.com, kirill.shutemov@...ux.intel.com,
        mgorman@...hsingularity.net, mhocko@...nel.org, minchan@...nel.org,
        peterz@...radead.org, rientjes@...gle.com, vbabka@...e.cz,
        willy@...radead.org, ying.huang@...el.com, nitingupta910@...il.com
Subject: Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous
 memory



On 11/09/2018 04:13 AM, Kirill A. Shutemov wrote:
> On Thu, Nov 08, 2018 at 10:48:58PM -0800, Anthony Yznaga wrote:
>> The basic idea as outlined by Mel Gorman in [2] is:
>>
>> 1) On first fault in a sufficiently sized range, allocate a huge page
>>    sized and aligned block of base pages.  Map the base page
>>    corresponding to the fault address and hold the rest of the pages in
>>    reserve.
>> 2) On subsequent faults in the range, map the pages from the reservation.
>> 3) When enough pages have been mapped, promote the mapped pages and
>>    remaining pages in the reservation to a huge page.
>> 4) When there is memory pressure, release the unused pages from their
>>    reservations.
> I haven't yet read the patch in details, but I'm skeptical about the
> approach in general for few reasons:
>
> - PTE page table retracting to replace it with huge PMD entry requires
>   down_write(mmap_sem). It makes the approach not practical for many
>   multi-threaded workloads.
>
>   I don't see a way to avoid exclusive lock here. I will be glad to
>   be proved otherwise.
>
> - The promotion will also require TLB flush which might be prohibitively
>   slow on big machines.
>
> - Short living processes will fail to benefit from THP with the policy,
>   even with plenty of free memory in the system: no time to promote to THP
>   or, with synchronous promotion, cost will overweight the benefit.
>
> The goal to reduce memory overhead of THP is admirable, but we need to be
> careful not to kill THP benefit itself. The approach will reduce number of
> THP mapped in the system and/or shift their allocation to later stage of
> process lifetime.
>
> The only way I see it can be useful is if it will be possible to apply the
> policy on per-VMA basis. It will be very useful for malloc()
> implementations, for instance. But as a global policy it's no-go to me.
I agree that this should not be a global policy.  For example, it seems to me
that a VMA where MADV_HUGEPAGE has been applied should get huge
pages on first faults (I need to fix that in my implementation).
>
> Prove me wrong with performance data. :)
I'll try.  :-)

Thanks for the comments!

Anthony