Message-ID: <20260206110130.00005fc2.alireza.sanaee@huawei.com>
Date: Fri, 6 Feb 2026 11:01:30 +0000
From: Alireza Sanaee <alireza.sanaee@...wei.com>
To: Jonathan Cameron <jonathan.cameron@...wei.com>
CC: Gregory Price <gourry@...rry.net>, Ira Weiny <ira.weiny@...el.com>, "Dave
Jiang" <dave.jiang@...el.com>, Fan Ni <fan.ni@...sung.com>, Dan Williams
<dan.j.williams@...el.com>, Davidlohr Bueso <dave@...olabs.net>, "Alison
Schofield" <alison.schofield@...el.com>, Vishal Verma
<vishal.l.verma@...el.com>, <linux-cxl@...r.kernel.org>,
<nvdimm@...ts.linux.dev>, <linux-kernel@...r.kernel.org>, Li Ming
<ming.li@...omail.com>
Subject: Re: [PATCH v9 00/19] DCD: Add support for Dynamic Capacity Devices
(DCD)
On Thu, 5 Feb 2026 17:48:47 +0000
Jonathan Cameron <jonathan.cameron@...wei.com> wrote:
Hi Jonathan,
Thanks for the clarifications.
Quick thought inline.
> > > I'm not clear if sysram could be used for virtio, or even needed. I'm
> > > still figuring out how virtio of simple memory devices is a gain.
> > >
> >
> > Jonathan mentioned that he thinks it would be possible to just bring it
> > online as a private-node and inform the consumer of this. I think
> > that's probably reasonable.
>
> Firstly VM == Application. If we have say a DB that wants to do everything
> itself, it would use the same interface as a VM to get the whole memory
> on offer. (I'm still trying to get that Application Specific Memory term
> adopted ;)
>
> This would be better if we didn't assume anything to do with virtio
> - that's just one option (and right now for CXL mem probably not the
> sensible one as it's missing too many things we get for free by just
> emulating CXL devices - e.g. all the stuff you are describing here
> for the host is just as valid in the guest.) We have a path to
> get that emulation and should have the big missing piece posted shortly
> (DCD backed by 'things - this discussion' that turn up after VM boot).
>
> The real topic is memory for a VM and we need a way to tie a memory
> backend in qemu to it, so that whatever the fabric manager provided for
> that VM is given to the VM and not used for anything else.
>
> If it's for a specific VM, then it's tagged, as otherwise how else
> do we know the intent? (let's ignore random other out-of-band paths).
>
> Layering wise we can surface as many backing sources as we like at
> runtime via 1+ emulated DCD devices (to give perf information etc).
> They each show up in the guest as a contiguous (maybe tagged) single
> extent and then we apply whatever comes out of the rest of this
> discussion on top of that.
>
> So all we care about is how the host presents it.
>
> Bunch of things might work for this.
>
> 1. Just put it in a numa node that requires specific selection to allocate
> from. This is nice because it just looks like normal memory and we
> can apply any type of front end on top of that. Not good if we have a lot
> of these coming and going.
>
> 2. Provide it as something with an fd we can memmap. I was fine with Dax for
> this but if it's normal ram just for a VM anything that gives me a handle
> that I can memmap is fine. Just need a way to know which one (so tag).
I think both of these approaches are OK, but from a developer's
perspective, if someone wants specific memory for their workload, they
should rather get an fd and use it in whichever way they want. NUMA may
not give that much flexibility. As a developer I would prefer 2. Though you
may ask: DAX then? Not sure!
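
To make option 2 concrete, here is a rough sketch of what "get an fd and
map it" looks like from userspace. It is only an illustration: the
interface that would hand out the fd for a tagged region is still
undecided in this thread, so an ordinary temporary file stands in for it,
and map_backing_fd() and demo() are hypothetical names.

```python
import mmap
import tempfile


def map_backing_fd(fd: int, length: int) -> mmap.mmap:
    """Map `length` bytes of `fd` as a shared, read/write region."""
    return mmap.mmap(
        fd,
        length,
        flags=mmap.MAP_SHARED,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )


def demo() -> bytes:
    # Assumption: a temp file stands in for the real handle. A real consumer
    # would open whatever node the kernel ends up exposing for the tagged
    # capacity; that path is not defined anywhere in this discussion.
    length = mmap.PAGESIZE
    with tempfile.TemporaryFile() as backing:
        backing.truncate(length)  # size the stand-in to one page
        region = map_backing_fd(backing.fileno(), length)
        try:
            region[:5] = b"hello"  # the consumer uses the mapping as plain memory
            return bytes(region[:5])
        finally:
            region.close()
```

The point is that once the consumer holds the fd, a VMM or a database can
lay out the mapping however it likes, which is the flexibility I am after.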
>
> It's pretty similar for shared cases. Just need a handle to memmap.
> In that case, tag goes straight up to guest OS (we've just unwound the
> extent ordering in the host and presented it as a contiguous single
> extent).
>
> Assumption here is we always provide all that capacity that was tagged
> for the VM to use to the VM. Things may get more entertaining if we have
> a bunch of capacity that was tagged to provide extra space for a set of
> VMs (e.g. we overcommit on top of the DCD extents) - to me that's a
> job for another day.
>
> So I'm not really envisioning anything special for the VM case, it's
> just a dedicated allocation of memory for a user who knows how to get it.
> We will want a way to get perf info though so we can provide that
> in the VM. Maybe can figure that out from the CXL HW backing it without
> needing anything special in what is being discussed here.
>
> Jonathan
>
> >
> > ~Gregory
>