Message-ID: <aSDUl7kU73LJR78g@gourry-fedora-PF4VCD3F>
Date: Fri, 21 Nov 2025 16:07:35 -0500
From: Gregory Price <gourry@...rry.net>
To: Alistair Popple <apopple@...dia.com>
Cc: linux-mm@...ck.org, kernel-team@...a.com, linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org, nvdimm@...ts.linux.dev,
linux-fsdevel@...r.kernel.org, cgroups@...r.kernel.org,
dave@...olabs.net, jonathan.cameron@...wei.com,
dave.jiang@...el.com, alison.schofield@...el.com,
vishal.l.verma@...el.com, ira.weiny@...el.com,
dan.j.williams@...el.com, longman@...hat.com,
akpm@...ux-foundation.org, david@...hat.com,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz,
rppt@...nel.org, surenb@...gle.com, mhocko@...e.com,
osalvador@...e.de, ziy@...dia.com, matthew.brost@...el.com,
joshua.hahnjy@...il.com, rakie.kim@...com, byungchul@...com,
ying.huang@...ux.alibaba.com, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, tj@...nel.org, hannes@...xchg.org,
mkoutny@...e.com, kees@...nel.org, muchun.song@...ux.dev,
roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
rientjes@...gle.com, jackmanb@...gle.com, cl@...two.org,
harry.yoo@...cle.com, axelrasmussen@...gle.com, yuanchu@...gle.com,
weixugc@...gle.com, zhengqi.arch@...edance.com,
yosry.ahmed@...ux.dev, nphamcs@...il.com, chengming.zhou@...ux.dev,
fabio.m.de.francesco@...ux.intel.com, rrichter@....com,
ming.li@...omail.com, usamaarif642@...il.com, brauner@...nel.org,
oleg@...hat.com, namcao@...utronix.de, escape@...ux.alibaba.com,
dongjoo.seo1@...sung.com
Subject: Re: [RFC LPC2026 PATCH v2 00/11] Specific Purpose Memory NUMA Nodes
On Tue, Nov 18, 2025 at 06:02:02PM +1100, Alistair Popple wrote:
>
> I'm interested in the contrast with zone_device, and in particular why
> device_coherent memory doesn't end up being a good fit for this.
>
> > - Why mempolicy.c and cpusets as-is are insufficient
> > - SPM types seeking this form of interface (Accelerator, Compression)
>
> I'm sure you can guess my interest is in GPUs which also have memory some people
> consider should only be used for specific purposes :-) Currently our coherent
> GPUs online this as a normal NUMA node, for which we have also generally
> found mempolicy, cpusets, etc. inadequate as well, so it will be interesting to
> hear what shortcomings you have been running into (I'm less familiar with the
> Compression cases you talk about here though).
>
After some thought, talks, and doc readings, it seems like the
zone_device setups don't allow the CPU to map the devmem into page
tables, and instead depend on migrate_device logic (unless the docs are
out of sync with the code these days). That's at least what's described
in the hmm and migrate_device docs.

Assuming this is out of date and ZONE_DEVICE memory is mappable into
page tables, and assuming you want sparse allocation, ZONE_DEVICE seems
to suggest you at least have to re-implement the buddy logic (which
isn't that tall of an ask).

But I could imagine an (overly simplistic) pattern with SPM Nodes:

fd = open("/dev/gpu_mem", ...)
buf = mmap(fd, ...)
buf[0]
   1) driver takes the fault
   2) driver calls alloc_page(..., gpu_node, GFP_SPM_NODE)
   3) driver manages any special page table masks
      Like marking pages RO/RW to manage ownership.
   4) driver sends the gpu the (mapping_id, pfn, index) information
      so that gpu can map the region in its page tables.
   5) since the memory is cache coherent, gpu and cpu are free to
      operate directly on the pages without any additional magic
      (except typical concurrency controls).
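
For illustration only, the userspace half of that pattern (open, mmap,
first touch) could be sketched roughly as below. This is a minimal
sketch, not anything from the series: first_touch() is a hypothetical
helper name, "/dev/gpu_mem" is the placeholder path from the example
above and does not exist today, and the fault-side steps 1-5 would live
in the driver and are not shown.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of `path` as a shared mapping and
 * read the first byte. On a device like the imagined /dev/gpu_mem, that
 * first read is the access that would fault into the driver.
 *
 * Returns the byte value (0..255) on success, -1 on failure.
 */
static int first_touch(const char *path, size_t len)
{
	unsigned char *buf;
	int value;
	int fd;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after the fd is closed */
	if (buf == MAP_FAILED)
		return -1;

	value = buf[0];	/* first access: this is where the fault would occur */

	munmap(buf, len);
	return value;
}
```

Against a regular file this just returns the file's first byte; the
point is only that nothing on the userspace side is special, and all of
the SPM-specific work would happen in the driver's fault path.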

The driver doesn't have to do much in the way of allocation management.

This is probably less compelling, since you don't want general-purpose
services like reclaim, migration, compaction, tiering, etc.

The value is clearly that you get to manage GPU memory like any other
memory, but without worrying that other parts of the system will touch it.

I'm much more focused on the "I have memory that is otherwise general
purpose, and wants services like reclaim and compaction, but I want
strong controls over how things can land there in the first place" case.

~Gregory