Message-ID: <Zyo2uvFJKdExcQfH@PC2K9PVX.TheFacebook.com>
Date: Tue, 5 Nov 2024 10:16:10 -0500
From: Gregory Price <gourry@...rry.net>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, david@...hat.com, nphamcs@...il.com,
nehagholkar@...a.com, abhishekd@...a.com,
Johannes Weiner <hannes@...xchg.org>,
Feng Tang <feng.tang@...el.com>
Subject: Re: [PATCH 0/3] mm,TPP: Enable promotion of unmapped pagecache

On Tue, Nov 05, 2024 at 10:00:59AM +0800, Huang, Ying wrote:
> Hi, Gregory,
>
> Gregory Price <gourry@...rry.net> writes:
>
> > My observations between these 3 proposals:
> >
> > - The page-lock state is complex while trying to interpose in mark_folio_accessed,
> > meaning inline promotion inside that interface is a non-starter.
> >
> > We found one deadlock during task exit due to the PTL being held.
> >
> > This worries me more generally, but we did find some success changing certain
> > calls to mark_folio_accessed to mark_folio_accessed_and_promote - rather than
> > modifying mark_folio_accessed. This ends up changing code in similar places
> > to your hook - but catches more conditions that mark a page accessed.
> >
> > - For Keith's proposal, promotions via LRU require memory pressure on the lower
> > tier to cause a shrink and therefore promotions. I'm not well versed in
> > LRU semantics, but it seems we could try proactive reclaim here.
> >
> > Doing promote-reclaim and demote/swap/evict reclaim on the same triggers
> > seems counter-intuitive.
>
> IIUC, in the TPP paper (https://arxiv.org/abs/2206.02878), a similar method
> is proposed for page promotion. I guess that it works together with
> proactive reclaim.
>
Each process is responsible for doing page table scanning for NUMA hint faults
and producing a promotion. Since the structure used there is the page tables
themselves, there isn't an existing recording mechanism for us to piggy-back on
to defer migrations to later.

> > - Doing promotions inline with access creates overhead. I've seen some research
> > suggesting 60us+ per migration - so aggressiveness could harm performance.
> >
> > Doing it async would alleviate inline access overheads - but it could also make
> > promotion pointless if time-to-promote is too far from liveness of the pages.
>
> Async promotion needs to deal with the resource (CPU/memory) charging
> too. You do some work for a task, so you need to charge the consumed
> resource for the task.
>
This is a good point, and would heavily complicate things. Simple is better;
let's avoid that.

> > - Doing async-promotion may also require something like PG_PROMOTABLE (as proposed
> > by Keith's patch), which will obviously be a very contentious topic.
>
> Some additional data structure can be used to record pages.
>
I have an idea inspired by these three sets; I'll bumble my way through a prototype.

> > Reading more into the code surrounding this and other migration logic, I also
> > think we should explore an optimization to mempolicy that tries to aggressively
> > keep certain classes of memory on the local node (RX memory and stack
> > for example).
> >
> > Other areas of reclaim try to actively prevent demoting this type of memory, so we
> > should try not to allocate it there in the first place.
>
> We already use a DRAM-first allocation policy. So, we need to
> measure its effect first.
>
Yes, but as the weighted interleave patch set demonstrated, it can be beneficial
to change this to distribute allocations from the outset. However, distributing all
allocations leads to less reliable performance than distributing just the heap.
Another topic for another thread.

~Gregory