[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251204022136.2573521-1-dan.j.williams@intel.com>
Date: Wed, 3 Dec 2025 18:21:30 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: dave.jiang@...el.com
Cc: linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org,
Smita.KoralahalliChannabasappa@....com,
alison.schofield@...el.com,
terry.bowman@....com,
alejandro.lucero-palau@....com,
linux-pci@...r.kernel.org,
Jonathan.Cameron@...wei.com,
Alejandro Lucero <alucerop@....com>,
Shiju Jose <shiju.jose@...wei.com>
Subject: [PATCH 0/6] cxl: Initialization reworks in support Soft Reserve Recovery and Accelerator Memory
The CXL subsystem is modular. That modularity is a benefit for
separation of concerns and testing. It is generally appropriate for this
class of devices that support hotplug and can dynamically add a CXL
personality alongside their PCI personality. However, a cost of modules
is ambiguity about when devices (cxl_memdevs, cxl_ports, cxl_regions)
have had a chance to attach to their corresponding drivers on
@cxl_bus_type.
This problem of not being able to reliably determine when a device has
had a chance to attach to its driver vs still waiting for the module to
load, is a common problem for the "Soft Reserve Recovery" [1], and
"Accelerator Memory" [2] enabling efforts.
For "Soft Reserve Recovery" it wants to use wait_for_device_probe() as a
sync point for when CXL devices present at boot have had a chance to
attach to the cxl_pci driver (generic CXL memory expansion class
driver). That breaks down if wait_for_device_probe() only flushes PCI
device probe, but not the cxl_mem_probe() of the cxl_memdev that
cxl_pci_probe() creates.
For "Accelerator Memory", the driver is not cxl_pci, but any potential
PCI driver that wants to use the devm_cxl_add_memdev() ABI to attach to
the CXL memory domain. Those drivers want to know if the CXL link is
live end-to-end (from endpoint, through switches, to the host bridge)
and CXL memory operations are enabled. If not, a CXL accelerator may be
able to fall back to PCI-only operation. Similar to the "Soft Reserve
Memory" it needs to know that the CXL subsystem had a chance to probe
the ancestor topology of the device and let that driver make a
synchronous decision about CXL operation.
In support of those efforts:
* Clean up some resource lifetime issues in the current code
* Move some object creation symbols (devm_cxl_add_memdev() and
devm_cxl_add_endpoint()) into the cxl_mem.ko and cxl_port.ko objects.
Implicitly guarantee that cxl_mem_driver and cxl_port_driver have been
registered prior to any device objects being registered. This is
preferred over explicit open-coded request_module().
* Use scoped-based-cleanup before adding more resource management in
devm_cxl_add_memdev()
* Give an accelerator the opportunity to run setup operations in
cxl_mem_probe() so it can further probe if the CXL configuration matches
its needs.
Some of these previously appeared on a branch as an RFC [3] and left
"Soft Reserve Recovery" and "Accelerator Memory" to jockey for ordering.
Instead, create a shared topic branch for both of those efforts to
import. The main changes since that RFC are fixing a bug and reducing
the amount of refactoring (which contributed to hiding the bug).
[1]: http://lore.kernel.org/20251120031925.87762-1-Smita.KoralahalliChannabasappa@amd.com
[2]: http://lore.kernel.org/20251119192236.2527305-1-alejandro.lucero-palau@amd.com
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.18/cxl-probe-order
Dan Williams (6):
cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
cxl/mem: Arrange for always-synchronous memdev attach
cxl/port: Arrange for always synchronous endpoint attach
cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup
cxl/mem: Drop @host argument to devm_cxl_add_memdev()
cxl/mem: Introduce a memdev creation ->probe() operation
drivers/cxl/Kconfig | 2 +-
drivers/cxl/cxl.h | 2 +
drivers/cxl/cxlmem.h | 17 ++++--
drivers/cxl/core/edac.c | 64 ++++++++++++---------
drivers/cxl/core/memdev.c | 104 ++++++++++++++++++++++++-----------
drivers/cxl/mem.c | 69 +++++++++--------------
drivers/cxl/pci.c | 2 +-
drivers/cxl/port.c | 40 ++++++++++++++
tools/testing/cxl/test/mem.c | 2 +-
9 files changed, 192 insertions(+), 110 deletions(-)
base-commit: ea5514e300568cbe8f19431c3e424d4791db8291
--
2.51.1
Powered by blists - more mailing lists