[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <143deecb-aa53-42e6-b7eb-91fb392e7502@amd.com>
Date: Fri, 5 Dec 2025 15:15:16 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: Dan Williams <dan.j.williams@...el.com>, dave.jiang@...el.com
Cc: linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org,
Smita.KoralahalliChannabasappa@....com, alison.schofield@...el.com,
terry.bowman@....com, alejandro.lucero-palau@....com,
linux-pci@...r.kernel.org, Jonathan.Cameron@...wei.com,
Shiju Jose <shiju.jose@...wei.com>
Subject: Re: [PATCH 0/6] cxl: Initialization reworks in support Soft Reserve
Recovery and Accelerator Memory
On 12/4/25 02:21, Dan Williams wrote:
> The CXL subsystem is modular. That modularity is a benefit for
> separation of concerns and testing. It is generally appropriate for this
> class of devices that support hotplug and can dynamically add a CXL
> personality alongside their PCI personality. However, a cost of modules
> is ambiguity about when devices (cxl_memdevs, cxl_ports, cxl_regions)
> have had a chance to attach to their corresponding drivers on
> @cxl_bus_type.
>
> This problem of not being able to reliably determine when a device has
> had a chance to attach to its driver vs still waiting for the module to
> load, is a common problem for the "Soft Reserve Recovery" [1], and
> "Accelerator Memory" [2] enabling efforts.
>
> For "Soft Reserve Recovery" it wants to use wait_for_device_probe() as a
> sync point for when CXL devices present at boot have had a chance to
> attach to the cxl_pci driver (generic CXL memory expansion class
> driver). That breaks down if wait_for_device_probe() only flushes PCI
> device probe, but not the cxl_mem_probe() of the cxl_memdev that
> cxl_pci_probe() creates.
>
> For "Accelerator Memory", the driver is not cxl_pci, but any potential
> PCI driver that wants to use the devm_cxl_add_memdev() ABI to attach to
> the CXL memory domain. Those drivers want to know if the CXL link is
> live end-to-end (from endpoint, through switches, to the host bridge)
> and CXL memory operations are enabled. If not, a CXL accelerator may be
> able to fall back to PCI-only operation. Similar to the "Soft Reserve
> Memory" it needs to know that the CXL subsystem had a chance to probe
> the ancestor topology of the device and let that driver make a
> synchronous decision about CXL operation.
IMO, this is not the problem with accelerators, because this can not be
dynamically done, or not easily. The HW will support CXL or PCI, and if
CXL mem is not enabled by the firmware, likely due to a
negotiation/linking problem, the driver can keep going with CXL.io. Of
course, this is from my experience with sfc driver/hardware. Note sfc
driver added the check for CXL availability based on Terry's v13.
But this is useful for solving the problem of module removal which can
leave the type2 driver without the base for doing any unwinding. Once a
type2 uses code from those other cxl modules explicitly, the problem is
avoided. You seem to have forgotten about this problem, what I think it
is worth to describe.
>
> In support of those efforts:
>
> * Clean up some resource lifetime issues in the current code
> * Move some object creation symbols (devm_cxl_add_memdev() and
> devm_cxl_add_endpoint()) into the cxl_mem.ko and cxl_port.ko objects.
> Implicitly guarantee that cxl_mem_driver and cxl_port_driver have been
> registered prior to any device objects being registered. This is
> preferred over explicit open-coded request_module().
> * Use scoped-based-cleanup before adding more resource management in
> devm_cxl_add_memdev()
> * Give an accelerator the opportunity to run setup operations in
> cxl_mem_probe() so it can further probe if the CXL configuration matches
> its needs.
>
> Some of these previously appeared on a branch as an RFC [3] and left
> "Soft Reserve Recovery" and "Accelerator Memory" to jockey for ordering.
> Instead, create a shared topic branch for both of those efforts to
> import. The main changes since that RFC are fixing a bug and reducing
> the amount of refactoring (which contributed to hiding the bug).
>
> [1]: http://lore.kernel.org/20251120031925.87762-1-Smita.KoralahalliChannabasappa@amd.com
> [2]: http://lore.kernel.org/20251119192236.2527305-1-alejandro.lucero-palau@amd.com
> [3]: https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.18/cxl-probe-order
>
> Dan Williams (6):
> cxl/mem: Fix devm_cxl_memdev_edac_release() confusion
> cxl/mem: Arrange for always-synchronous memdev attach
> cxl/port: Arrange for always synchronous endpoint attach
> cxl/mem: Convert devm_cxl_add_memdev() to scope-based-cleanup
> cxl/mem: Drop @host argument to devm_cxl_add_memdev()
> cxl/mem: Introduce a memdev creation ->probe() operation
>
> drivers/cxl/Kconfig | 2 +-
> drivers/cxl/cxl.h | 2 +
> drivers/cxl/cxlmem.h | 17 ++++--
> drivers/cxl/core/edac.c | 64 ++++++++++++---------
> drivers/cxl/core/memdev.c | 104 ++++++++++++++++++++++++-----------
> drivers/cxl/mem.c | 69 +++++++++--------------
> drivers/cxl/pci.c | 2 +-
> drivers/cxl/port.c | 40 ++++++++++++++
> tools/testing/cxl/test/mem.c | 2 +-
> 9 files changed, 192 insertions(+), 110 deletions(-)
>
>
> base-commit: ea5514e300568cbe8f19431c3e424d4791db8291
Powered by blists - more mailing lists