Message-ID: <aXJQGlRr7WrLyiU-@gourry-fedora-PF4VCD3F>
Date: Thu, 22 Jan 2026 11:28:10 -0500
From: Gregory Price <gourry@...rry.net>
To: linux-cxl@...r.kernel.org
Cc: dan.j.williams@...el.com, dave.jiang@...el.com,
jonathan.cameron@...wei.com, alison.schofield@...el.com,
ira.weiny@...el.com, dave@...olabs.net,
linux-kernel@...r.kernel.org, kernel-team@...a.com,
vishal.l.verma@...el.com, david@...nel.org,
benjamin.cheatham@....com
Subject: Re: cxl/region.c improvements and DAX/Hotplug plumbing
On Wed, Jan 21, 2026 at 02:38:48PM -0500, Gregory Price wrote:
> --------------------------------
> Problem: Per-region usage policy
> --------------------------------
... snip ...
>
> ABI: (RW) regionN/region_driver
> Read: Displays what region driver is assigned
> Write: Changing an uncommitted region's underlying driver
>
> ABI: regionN/rctl/*
> Exposes region_driver specific controls / information
> example: auto-online policy for sysram_region
>
Referencing this thread and questions from Ira:
https://lore.kernel.org/linux-cxl/aXGHgtAHNVWJsZbo@gourry-fedora-PF4VCD3F/
the appropriate design here is likely breaking out new drivers with
their own bind functions and leaving cxl/drivers/region/bind to be the
auto-decoder / compat interface.
so this turns into:
ABI:
cxl/drivers/dax_region/bind
cxl/drivers/pmem_region/bind
cxl/drivers/sysram_region/bind
Private regions likely remain internal interfaces for device drivers
(there's no way for userland to configure a set of callbacks)
(Note for Dan: Each probe function here can determine which
PARTMODE's are valid for that driver - so we can prevent
pmem from ever using non-pmem drivers)
This implies some changes to dax_region to at least not immediately
probe the dax device on creation so that policy can be programmed.
(see: dax_bus_probe() - unconditionally probes at discovery)
So laying out the current workflow:
CURRENT: region/bind auto-decoder / compat workflow
----------------------------------------------------
echo region0 > decoder0.0/create_ram_region
=> creates region0
/* program region0 (decoders, targets) */
echo region0 > cxl/drivers/region/bind
=> probe() creates dax_region
=> dax_region creates, configures, and registers dev_dax
=> dax_bus_probe discovers dev_dax and selects a driver
=> IORESOURCE_DAX_KMEM = dax_kmem if KMEM built in
=> otherwise device_dax
=> dev_dax probe happens automatically from bus_probe()
=> if device_dax driver, make /dev/dax/* and stop
=> if kmem driver, engage memory-hotplug.c
=> system default hotplug policy is applied
All of this basically just happens automagically
----------------------------------------------------
The new workflows for manually created/programmed regions:
Manual dax_region Workflow
------------------------------------
echo region0 > decoder0.0/create_ram_region
=> creates region0
/* program region0 (decoders, targets) */
echo region0 > cxl/drivers/dax_region/bind
=> creates dax_region
=> selects device_dax driver
=> creates unprobed dev_dax
/* program dax_region controls */
echo daxN.M > dax/drivers/device_dax/bind
=> probes daxN.M
=> creates /dev/dax/ file
------------------------------------
Manual sysram_region Workflow
---------------------------------------
echo region0 > decoder0.0/create_ram_region
=> creates region0
/* program region0 (decoders, targets) */
echo region0 > cxl/drivers/sysram_region/bind
=> creates sysram_region which
=> creates dax_region
=> creates unprobed dev_dax
=> selects dax_kmem driver for dev_dax
/* program hotplug policy */
echo online_movable > sysram_region/hotplug
=> dax_region.hotplug_mode = MMOP_ONLINE_MOVABLE
echo daxN.M > dax/drivers/kmem/bind
=> probe daxN.M
=> add_memory_driver_managed(..., dax_region.hotplug_mode);
---------------------------------------
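A rough userspace sketch of what the hotplug policy store could look like. The SKETCH_* enum is a stand-in for the kernel's MMOP_* online types, and struct sysram_policy and sysram_hotplug_store() are hypothetical names, not existing kernel symbols. The accepted strings are assumed to match the ones memory hotplug already uses (offline, online, online_kernel, online_movable).

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the kernel's MMOP_* online types (values illustrative). */
enum sketch_hotplug_mode {
	SKETCH_OFFLINE,
	SKETCH_ONLINE,
	SKETCH_ONLINE_KERNEL,
	SKETCH_ONLINE_MOVABLE,
};

/* Hypothetical per-region policy state that the kmem probe would read. */
struct sysram_policy {
	enum sketch_hotplug_mode hotplug_mode;
};

/* Parse a sysfs-style write such as "online_movable\n" into the policy. */
int sysram_hotplug_store(struct sysram_policy *p, const char *buf)
{
	static const struct {
		const char *name;
		enum sketch_hotplug_mode mode;
	} tbl[] = {
		{ "offline",        SKETCH_OFFLINE },
		{ "online",         SKETCH_ONLINE },
		{ "online_kernel",  SKETCH_ONLINE_KERNEL },
		{ "online_movable", SKETCH_ONLINE_MOVABLE },
	};
	size_t len, i;

	if (!p || !buf)
		return -EINVAL;

	/* echo appends a newline; compare only the text before it. */
	len = strcspn(buf, "\n");

	for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
		if (strlen(tbl[i].name) == len &&
		    strncmp(buf, tbl[i].name, len) == 0) {
			p->hotplug_mode = tbl[i].mode;
			return 0;
		}
	}
	return -EINVAL;	/* unknown policy string, state left untouched */
}
```

The point is only that the policy is latched at write time and consumed later when daxN.M is bound to kmem, so an invalid string fails before any memory is added.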
For dynamic capacity regions, you can use these same drivers; dynamic
capacity just changes the default behavior of [dax,sysram]_region when probed.
Manual dc_region workflow
---------------------------------------
echo region0 > decoder0.0/create_dc_region
=> creates region0
/* program region0 (decoders, targets) */
echo region0 > cxl/drivers/[dax,sysram]_region/bind
=> creates xxx_region WITHOUT dev_dax
/* At this point, the extent discovery process takes over */
extent set arrives:
=> dc code calls `[dax,sysram]_region.add_extents(tag, extents)`
=> dax_region -> create new device_dax w/ set
=> sysram_region -> create new dax_kmem w/ set
---------------------------------------
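To make the add_extents() dispatch a bit more concrete, here is a minimal userspace sketch. Everything in it (struct dc_region, struct dc_region_ops, dc_extents_arrived(), the field names) is hypothetical and only models the shape of the callback: the region driver supplies the ops, and each arriving extent set becomes one new child device.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct extent {
	uint64_t start;
	uint64_t len;
};

struct dc_region;

/* Hypothetical per-driver ops; child_driver names what gets created. */
struct dc_region_ops {
	const char *child_driver;
	int (*add_extents)(struct dc_region *r, const char *tag,
			   const struct extent *ext, size_t n);
};

/* Hypothetical region state: no child devices exist until extents arrive. */
struct dc_region {
	const struct dc_region_ops *ops;
	size_t nr_devs;		/* one child device per extent set */
	uint64_t capacity;	/* total bytes across all accepted sets */
};

/* Shared model: account the set's capacity and create one child for it. */
static int add_one_dev(struct dc_region *r, const char *tag,
		       const struct extent *ext, size_t n)
{
	size_t i;

	(void)tag;	/* a real driver would key the new device on the tag */
	for (i = 0; i < n; i++)
		r->capacity += ext[i].len;
	r->nr_devs++;
	return 0;
}

const struct dc_region_ops dax_region_ops = { "device_dax", add_one_dev };
const struct dc_region_ops sysram_region_ops = { "dax_kmem", add_one_dev };

/* Called by the (modelled) dc code when an extent set shows up. */
int dc_extents_arrived(struct dc_region *r, const char *tag,
		       const struct extent *ext, size_t n)
{
	if (!r || !r->ops || !r->ops->add_extents || !ext || n == 0)
		return -EINVAL;
	return r->ops->add_extents(r, tag, ext, n);
}
```

The dc code never needs to know which driver is bound; it only calls through the ops that the bind step installed.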
cxl-cli
---------------------------------------
What this should look like to cxl-cli is something like:
cxl create-region -t [pmem,ram,dc] --driver=[pmem,dax,sysram,] ...
/* if not --driver ... -c --controller ? */
Since `[dax,sysram,...]_region/bind` will restrict which PARTMODE is
valid (pmem=>[pmem], ram=>[dax, sysram], dc=>[dax, sysram, ...]),
we have a clean failure condition that lets us undo the probe process
above if the user selects a bad combination.
cxl create-region -t pmem --driver=sysram ... => -EOPNOTSUPP
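The mode/driver restriction could be modelled as a small lookup table that each bind path consults before doing any work. This is only a sketch: the enums and region_driver_check() are hypothetical names rather than the kernel's actual partition-mode definitions, and using EOPNOTSUPP as the "not supported" errno is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the partition modes and region drivers. */
enum part_mode { PART_RAM, PART_PMEM, PART_DC, PART_MODE_COUNT };
enum region_driver { DRV_PMEM, DRV_DAX, DRV_SYSRAM, DRV_COUNT };

/* pmem=>[pmem], ram=>[dax, sysram], dc=>[dax, sysram] */
static const bool valid_combo[PART_MODE_COUNT][DRV_COUNT] = {
	[PART_RAM]  = { [DRV_DAX] = true, [DRV_SYSRAM] = true },
	[PART_PMEM] = { [DRV_PMEM] = true },
	[PART_DC]   = { [DRV_DAX] = true, [DRV_SYSRAM] = true },
};

/* Returns 0 if the driver may bind a region of this mode, else -errno. */
int region_driver_check(enum part_mode mode, enum region_driver drv)
{
	if ((unsigned int)mode >= PART_MODE_COUNT ||
	    (unsigned int)drv >= DRV_COUNT)
		return -EINVAL;

	return valid_combo[mode][drv] ? 0 : -EOPNOTSUPP;
}
```

Because the check runs at the top of probe, a bad combination fails before any dax_region or dev_dax has been created, so there is nothing to unwind beyond the region itself.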
----------------------------------------
I think this detangles most of region.c probe/policy issues.
From here most everything else I describe can be implemented in the
relevant driver directory. (replaces region0/rctl/ in original email)
region0/sysram_region/* - policy controls
region0/sysram_region/dax_region/* - account daxN.M's
region0/sysram_region/dax_region/daxN.M/hotplug - atomic hotplug
~Gregory