Message-ID: <ad5e9cdc-9bdd-4824-9c11-171bfcc39b38@amd.com>
Date: Mon, 26 May 2025 10:50:02 +0530
From: Bharata B Rao <bharata@....com>
To: SeongJae Park <sj@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Jonathan.Cameron@...wei.com, dave.hansen@...el.com, gourry@...rry.net,
hannes@...xchg.org, mgorman@...hsingularity.net, mingo@...hat.com,
peterz@...radead.org, raghavendra.kt@....com, riel@...riel.com,
rientjes@...gle.com, weixugc@...gle.com, willy@...radead.org,
ying.huang@...ux.alibaba.com, ziy@...dia.com, dave@...olabs.net,
nifan.cxl@...il.com, joshua.hahnjy@...il.com, xuezhengchu@...wei.com,
yiannis@...corp.com, akpm@...ux-foundation.org, david@...hat.com
Subject: Re: [RFC PATCH v0 0/2] Batch migration for NUMA balancing

Hi SJ,

On 22-May-25 12:15 AM, SeongJae Park wrote:
> Hi Bharata,
>
> On Wed, 21 May 2025 13:32:36 +0530 Bharata B Rao <bharata@....com> wrote:
>
>> Hi,
>>
>> This is an attempt to convert the NUMA balancing to do batched
>> migration instead of migrating one folio at a time. The basic
>> idea is to collect (from hint fault handler) the folios to be
>> migrated in a list and batch-migrate them from task_work context.
>> More details about the specifics are present in patch 2/2.
>>
>> During LSFMM[1] and subsequent discussions in MM alignment calls[2],
>> it was suggested that separate migration threads to handle migration
>> or promotion request may be desirable. Existing NUMA balancing, hot
>> page promotion and other future promotion techniques could off-load
>> migration part to these threads. Or if we manage to have a single
>> source of hotness truth like kpromoted[3], then that too can hand
>> over migration requests to the migration threads. I am envisaging
>> that different hotness sources like kmmscand[4], MGLRU[5], IBS[6]
>> and CXL HMU would push hot page info to kpromoted, which would
>> then isolate and push the folios to be promoted to the migrator
>> thread.
>
> I think (or, hope) it would not be worthless or rude to mention here other
> existing and ongoing works that have the potential to serve a similar purpose
> or to collaborate with in the future.
>
> DAMON is designed for a sort of multi-source access information handling. In
> LSFMM, I proposed[1] damon_report_access() interface for making it easier to be
> extended for more types of access information. Currently damon_report_access()
> is under early development. I think this has a potential to serve something
> similar to your single source goal.
>
> Also in LSFMM, I proposed damos_add_folio() for a case that callers want to
> utilize DAMON worker thread (kdamond) as an asynchronous memory
> management operations execution thread while using its other features such as
> [auto-tuned] quotas. I think this has a potential to serve something similar
> to your migration threads. I haven't started damos_add_folio() development
> yet, though.
>
> I remember we discussed DAMON on the mailing list and in LSFMM a bit, in your
> session. IIRC, you were also looking for time to see if there is a chance to
> use DAMON in some way. Due to a technical issue, we were unable to discuss
> the two new proposals in my LSFMM session, and it has been a while since
> our last discussion. So if you don't mind, I'd like to ask if you have any
> opinions or comments about these.
>
> [1] https://lwn.net/Articles/1016525/

Since this patchset is only about making migration batched and
async for NUMAB, I didn't mention DAMON as an alternative here.

One of the concerns I have always had about DAMON, when it is considered as
a replacement for existing hot page migration, is its current inability to
gather and maintain hot page info at per-folio granularity. How much
that eventually matters to workloads remains to be seen.

Regards,
Bharata.