Message-ID: <e5873a6e-db77-d654-6df6-9b5017c31f70@oracle.com>
Date: Fri, 9 Nov 2018 16:04:56 -0800
From: anthony.yznaga@...cle.com
To: "Kirill A. Shutemov" <kirill@...temov.name>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
aarcange@...hat.com, aneesh.kumar@...ux.ibm.com,
akpm@...ux-foundation.org, jglisse@...hat.com,
khandual@...ux.vnet.ibm.com, kirill.shutemov@...ux.intel.com,
mgorman@...hsingularity.net, mhocko@...nel.org, minchan@...nel.org,
peterz@...radead.org, rientjes@...gle.com, vbabka@...e.cz,
willy@...radead.org, ying.huang@...el.com, nitingupta910@...il.com
Subject: Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory

On 11/09/2018 04:13 AM, Kirill A. Shutemov wrote:
> On Thu, Nov 08, 2018 at 10:48:58PM -0800, Anthony Yznaga wrote:
>> The basic idea as outlined by Mel Gorman in [2] is:
>>
>> 1) On first fault in a sufficiently sized range, allocate a huge page
>> sized and aligned block of base pages. Map the base page
>> corresponding to the fault address and hold the rest of the pages in
>> reserve.
>> 2) On subsequent faults in the range, map the pages from the reservation.
>> 3) When enough pages have been mapped, promote the mapped pages and
>> remaining pages in the reservation to a huge page.
>> 4) When there is memory pressure, release the unused pages from their
>> reservations.
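
To make the four steps concrete, here is a minimal userspace sketch of the reservation life cycle as a small state machine. It is not code from the patch: the names (thp_reservation, reservation_fault, reservation_shrink) are hypothetical, and PROMOTE_THRESHOLD is an assumed value, since "enough pages" is not quantified in the outline above. It also assumes 4 KiB base pages and 2 MiB huge pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_HUGE    512  /* assumption: 2 MiB huge page / 4 KiB base pages */
#define PROMOTE_THRESHOLD 256  /* hypothetical: "enough pages" is not quantified */

enum res_state { RES_NONE, RES_HELD, RES_PROMOTED, RES_RELEASED };

/* Toy model of one huge-page-sized, huge-page-aligned range. */
struct thp_reservation {
    enum res_state state;
    bool mapped[PAGES_PER_HUGE];  /* which base pages are mapped */
    size_t nr_mapped;
};

/*
 * Handle a fault on base page idx of the range (steps 1-3).
 * Returns true if this fault promoted the range to a huge page.
 */
bool reservation_fault(struct thp_reservation *r, size_t idx)
{
    assert(idx < PAGES_PER_HUGE);

    if (r->state == RES_PROMOTED)
        return false;             /* range is already one huge mapping */

    if (r->state == RES_NONE)
        r->state = RES_HELD;      /* step 1: first fault allocates the block */

    if (!r->mapped[idx]) {        /* steps 1 and 2: map one base page */
        r->mapped[idx] = true;
        r->nr_mapped++;
    }

    /* step 3: promote once enough pages of a held reservation are mapped */
    if (r->state == RES_HELD && r->nr_mapped >= PROMOTE_THRESHOLD) {
        r->state = RES_PROMOTED;
        return true;
    }
    return false;
}

/*
 * Step 4: under memory pressure, give back the unused pages.
 * Returns the number of base pages released; mapped pages stay mapped.
 */
size_t reservation_shrink(struct thp_reservation *r)
{
    if (r->state != RES_HELD)
        return 0;
    r->state = RES_RELEASED;
    return PAGES_PER_HUGE - r->nr_mapped;
}
```

In this model a range whose reservation has been released keeps taking base-page faults but is never promoted. That is a simplification chosen for the sketch, not a statement about what the patch does.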
> I haven't yet read the patch in detail, but I'm skeptical about the
> approach in general for a few reasons:
>
> - Retracting the PTE page table to replace it with a huge PMD entry
> requires down_write(mmap_sem). That makes the approach impractical for
> many multi-threaded workloads.
>
> I don't see a way to avoid the exclusive lock here. I will be glad to
> be proved otherwise.
>
> - The promotion will also require a TLB flush, which might be
> prohibitively slow on big machines.
>
> - Short-lived processes will fail to benefit from THP with the policy,
> even with plenty of free memory in the system: no time to promote to THP
> or, with synchronous promotion, the cost will outweigh the benefit.
>
> The goal of reducing the memory overhead of THP is admirable, but we need
> to be careful not to kill the THP benefit itself. The approach will reduce
> the number of THPs mapped in the system and/or shift their allocation to a
> later stage of the process lifetime.
>
> The only way I see it being useful is if the policy can be applied on a
> per-VMA basis. It would be very useful for malloc() implementations, for
> instance. But as a global policy it's a no-go for me.
I agree that this should not be a global policy. For example, it seems to me
that a VMA where MADV_HUGEPAGE has been applied should get huge
pages on first faults (I need to fix that in my implementation).
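
For reference, this is roughly how userspace marks such a VMA today. It is a sketch only: map_thp_hinted() is a hypothetical helper name, and the 2 MiB huge page size is an assumption that holds for x86-64 PMD-sized pages. Whether a huge page is actually allocated on first fault still depends on the THP sysfs settings and on free memory, so the advice is only a hint.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS and MADV_HUGEPAGE */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HUGE_PAGE_SIZE (2UL << 20) /* assumption: 2 MiB PMD-sized huge pages */

/*
 * Map an anonymous region aligned to HUGE_PAGE_SIZE and apply
 * MADV_HUGEPAGE to it. len is rounded up to a multiple of
 * HUGE_PAGE_SIZE. Returns the aligned address, or NULL if mmap() fails.
 * If hint_applied is non-NULL, it is set to 1 when madvise() succeeded
 * and to 0 otherwise (for example on a kernel built without THP).
 */
void *map_thp_hinted(size_t len, int *hint_applied)
{
    len = (len + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);

    /* Over-allocate by one huge page so an aligned start always exists. */
    size_t total = len + HUGE_PAGE_SIZE;
    char *raw = mmap(NULL, total, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t start = ((uintptr_t)raw + HUGE_PAGE_SIZE - 1) &
                      ~(uintptr_t)(HUGE_PAGE_SIZE - 1);
    assert(start % HUGE_PAGE_SIZE == 0);

    /* Trim the unaligned head and the leftover tail. */
    size_t head = start - (uintptr_t)raw;
    size_t tail = total - head - len;
    if (head)
        munmap(raw, head);
    if (tail)
        munmap((char *)start + len, tail);

    /* Mark the VMA; the kernel may still fall back to base pages. */
    int rc = madvise((void *)start, len, MADV_HUGEPAGE);
    if (hint_applied)
        *hint_applied = (rc == 0);

    return (void *)start;
}
```

A caller would do something like `void *p = map_thp_hinted(64UL << 20, &ok);` and then touch the memory. With reservations in place, a range set up this way is the case that should bypass the reservation path.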
>
> Prove me wrong with performance data. :)
I'll try. :-)
Thanks for the comments!
Anthony