Message-ID: <aPJPnZ01Gzi533v4@gourry-fedora-PF4VCD3F>
Date: Fri, 17 Oct 2025 10:15:57 -0400
From: Gregory Price <gourry@...rry.net>
To: Yiannis Nikolakopoulos <yiannis.nikolakop@...il.com>
Cc: Jonathan Cameron <jonathan.cameron@...wei.com>,
Wei Xu <weixugc@...gle.com>, David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Bharata B Rao <bharata@....com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, dave.hansen@...el.com, hannes@...xchg.org,
mgorman@...hsingularity.net, mingo@...hat.com, peterz@...radead.org,
raghavendra.kt@....com, riel@...riel.com, sj@...nel.org,
ying.huang@...ux.alibaba.com, ziy@...dia.com, dave@...olabs.net,
nifan.cxl@...il.com, xuezhengchu@...wei.com,
akpm@...ux-foundation.org, david@...hat.com, byungchul@...com,
kinseyho@...gle.com, joshua.hahnjy@...il.com, yuanchu@...gle.com,
balbirs@...dia.com, alok.rathore@...sung.com, yiannis@...corp.com,
Adam Manzanares <a.manzanares@...sung.com>
Subject: Re: [RFC PATCH v2 0/8] mm: Hot page tracking and promotion
infrastructure
On Fri, Oct 17, 2025 at 11:53:31AM +0200, Yiannis Nikolakopoulos wrote:
> On Wed, Oct 1, 2025 at 9:22 AM Gregory Price <gourry@...rry.net> wrote:
> > 1. Carve out an explicit proximity domain (NUMA node) for the compressed
> > region via SRAT.
> > https://docs.kernel.org/driver-api/cxl/platform/acpi/srat.html
> >
> > 2. Make sure this proximity domain (NUMA node) has separate data in the
> > HMAT so it can be an explicit demotion target for higher tiers
> > https://docs.kernel.org/driver-api/cxl/platform/acpi/hmat.html
> This makes sense. I've done a dirty hardcoding trick in my prototype
> so that my node is always the last target. I'll have a look on how to
> make this right.
I think it's probably a CEDT/CDAT/HMAT/SRAT/etc negotiation.
Essentially, the platform needs to allow a single device to expose
multiple NUMA nodes, one per memory range, based on the different
performance expected from those ranges. Then software needs to program
the HDM decoders appropriately.
> > 5. in `alloc_migration_target()` mm/migrate.c
> > Since nid is not a valid buddy-allocator target, everything here
> > will fail. So we can simply append the following to the bottom
> >
> >     device_folio_alloc = nid_to_alloc(nid, DEVICE_FOLIO_ALLOC);
> >     if (device_folio_alloc)
> >         folio = device_folio_alloc(...);
> >     return folio;
> In my current prototype alloc_migration_target was working (naively).
> Steps 3, 4 and 5 seem like an interesting thing to try after all this
> discussion.
> >
Right, because in your prototype the memory is directly accessible to
the buddy allocator. What I'm proposing would remove this memory from
the buddy allocator and force more explicit integration (in this case,
with this function).
More explicitly: in this design, __folio_alloc can never access this
memory.
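
To make the shape of that concrete, here is a rough userspace mock of the
fallback in step 5. It is only a sketch under assumptions: `struct folio`
here is a stand-in and not the kernel structure, and the callback table,
`register_device_alloc()`, `mock_buddy_alloc()`, `mock_device_alloc()`,
and `mock_alloc_migration_target()` are all hypothetical names made up for
illustration. `nid_to_alloc()` is the helper named in the quoted snippet,
simplified to take only a nid; no such kernel interface exists today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES 8 /* hypothetical; stands in for MAX_NUMNODES */

/* Mock folio, NOT the kernel's struct folio. */
struct folio {
	int nid;
	unsigned int order;
	int from_device; /* 1 if a device callback produced it */
};

typedef struct folio *(*device_folio_alloc_t)(int nid, unsigned int order);

/* Hypothetical per-node table; NULL means an ordinary buddy-managed node. */
static device_folio_alloc_t device_alloc_cb[MAX_NODES];

static int register_device_alloc(int nid, device_folio_alloc_t cb)
{
	if (nid < 0 || nid >= MAX_NODES)
		return -1;
	device_alloc_cb[nid] = cb;
	return 0;
}

/* Simplified stand-in for nid_to_alloc(nid, DEVICE_FOLIO_ALLOC). */
static device_folio_alloc_t nid_to_alloc(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return device_alloc_cb[nid];
}

static struct folio *new_folio(int nid, unsigned int order, int from_device)
{
	struct folio *f = malloc(sizeof(*f));

	if (f) {
		f->nid = nid;
		f->order = order;
		f->from_device = from_device;
	}
	return f;
}

/* Mock buddy path: it can never hand out memory on a device-owned node. */
static struct folio *mock_buddy_alloc(int nid, unsigned int order)
{
	if (nid_to_alloc(nid))
		return NULL;
	return new_folio(nid, order, 0);
}

/* Example device-side allocator a driver might register (hypothetical). */
static struct folio *mock_device_alloc(int nid, unsigned int order)
{
	return new_folio(nid, order, 1);
}

/* Buddy path first; on failure, fall back to the node's device callback. */
static struct folio *mock_alloc_migration_target(int nid, unsigned int order)
{
	struct folio *folio = mock_buddy_alloc(nid, order);
	device_folio_alloc_t device_folio_alloc;

	if (folio)
		return folio;

	device_folio_alloc = nid_to_alloc(nid);
	if (device_folio_alloc)
		folio = device_folio_alloc(nid, order);
	return folio;
}
```

With a callback registered for, say, node 3, `mock_buddy_alloc(3, 0)`
returns NULL while `mock_alloc_migration_target(3, 0)` still yields a
folio through the device path. That is the property I care about: the
only way to get at the memory is the explicit hook.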
~Gregory