[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <30d7f613-4089-4e64-893a-83ebf2e319c1@intel.com>
Date: Fri, 27 Jun 2025 11:17:24 -0700
From: Dave Jiang <dave.jiang@...el.com>
To: alejandro.lucero-palau@....com, linux-cxl@...r.kernel.org,
netdev@...r.kernel.org, dan.j.williams@...el.com, edward.cree@....com,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com, edumazet@...gle.com
Cc: Alejandro Lucero <alucerop@....com>
Subject: Re: [PATCH v17 10/22] cx/memdev: Indicate probe deferral
On 6/24/25 7:13 AM, alejandro.lucero-palau@....com wrote:
> From: Alejandro Lucero <alucerop@....com>
>
> The first step for a CXL accelerator driver that wants to establish new
> CXL.mem regions is to register a 'struct cxl_memdev'. That kicks off
> cxl_mem_probe() to enumerate all 'struct cxl_port' instances in the
> topology up to the root.
>
> If the port driver has not attached yet the expectation is that the
> driver waits until that link is established. The common cxl_pci driver
> has reason to keep the 'struct cxl_memdev' device attached to the bus
> until the root driver attaches. An accelerator may want to instead defer
> probing until CXL resources can be acquired.
>
> Use the @endpoint attribute of a 'struct cxl_memdev' to convey when a
> accelerator driver probing should be deferred vs failed. Provide that
> indication via a new cxl_acquire_endpoint() API that can retrieve the
> probe status of the memdev.
>
> Signed-off-by: Alejandro Lucero <alucerop@....com>
> ---
> drivers/cxl/core/memdev.c | 42 +++++++++++++++++++++++++++++++++++++++
> drivers/cxl/core/port.c | 2 +-
> drivers/cxl/mem.c | 7 +++++--
> include/cxl/cxl.h | 2 ++
> 4 files changed, 50 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index f43d2aa2928e..e2c6b5b532db 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -1124,6 +1124,48 @@ struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
> }
> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
>
> +/*
> + * Try to get a locked reference on a memdev's CXL port topology
> + * connection. Be careful to observe when cxl_mem_probe() has deposited
> + * a probe deferral awaiting the arrival of the CXL root driver.
> + */
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd)
Annotation of __acquires() is needed here to annotate that this function is taking multiple locks and keeping the locks.
> +{
> + struct cxl_port *endpoint;
> + int rc = -ENXIO;
> +
> + device_lock(&cxlmd->dev);
> +> + endpoint = cxlmd->endpoint;
> + if (!endpoint)
> + goto err;
> +
> + if (IS_ERR(endpoint)) {
> + rc = PTR_ERR(endpoint);
> + goto err;
> + }
> +
> + device_lock(&endpoint->dev);
> + if (!endpoint->dev.driver)> + goto err_endpoint;
> +
> + return endpoint;
> +
> +err_endpoint:
> + device_unlock(&endpoint->dev);
> +err:
> + device_unlock(&cxlmd->dev);
> + return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_acquire_endpoint, "CXL");
> +
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint)
And __releases() here to release the lock annotations
> +{
> + device_unlock(&endpoint->dev);
> + device_unlock(&cxlmd->dev);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_release_endpoint, "CXL");
> +
> static void sanitize_teardown_notifier(void *data)
> {
> struct cxl_memdev_state *mds = data;
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 9acf8c7afb6b..fa10a1643e4c 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1563,7 +1563,7 @@ static int add_port_attach_ep(struct cxl_memdev *cxlmd,
> */
> dev_dbg(&cxlmd->dev, "%s is a root dport\n",
> dev_name(dport_dev));
> - return -ENXIO;
> + return -EPROBE_DEFER;
> }
>
> struct cxl_port *parent_port __free(put_cxl_port) =
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 7f39790d9d98..cda0b2ff73ce 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -148,14 +148,17 @@ static int cxl_mem_probe(struct device *dev)
> return rc;
>
> rc = devm_cxl_enumerate_ports(cxlmd);
> - if (rc)
> + if (rc) {
> + cxlmd->endpoint = ERR_PTR(rc);
> return rc;
> + }
>
> struct cxl_port *parent_port __free(put_cxl_port) =
> cxl_mem_find_port(cxlmd, &dport);
> if (!parent_port) {
> dev_err(dev, "CXL port topology not found\n");
> - return -ENXIO;
> + cxlmd->endpoint = ERR_PTR(-EPROBE_DEFER);
kdoc to 'struct cxl_memdev' will be needed to explain this change of expectation for the endpoint member.
> + return -EPROBE_DEFER;
Can you please explain how the accelerator driver init path is different in this instance that it requires cxl_mem driver to defer probing? Currently with a type3, the cxl_acpi driver will setup the CXL root, hostbridges and PCI root ports. At that point the memdev driver will enumerate the rest of the ports and attempt to establish the hierarchy. However if cxl_acpi is not done, the mem probe will fail. But, the cxl_acpi probe will trigger a re-probe sequence at the end when it is done. At that point, the mem probe should discover all the necessary ports if things are correct. If the accelerator init path is different, can we introduce some documentation to explain the difference?
Also, it seems as long as port topology is not found, it will always go to deferred probing. At what point do we conclude that things may be missing/broken and we need to fail?
DJ
> }
>
> if (cxl_pmem_size(cxlds) && IS_ENABLED(CONFIG_CXL_PMEM)) {
> diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
> index fcdf98231ffb..2928e16a62e2 100644
> --- a/include/cxl/cxl.h
> +++ b/include/cxl/cxl.h
> @@ -234,4 +234,6 @@ int cxl_map_component_regs(const struct cxl_register_map *map,
> void cxl_set_capacity(struct cxl_dev_state *cxlds, u64 capacity);
> struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
> struct cxl_dev_state *cxlmds);
> +struct cxl_port *cxl_acquire_endpoint(struct cxl_memdev *cxlmd);
> +void cxl_release_endpoint(struct cxl_memdev *cxlmd, struct cxl_port *endpoint);
> #endif /* __CXL_CXL_H__ */
Powered by blists - more mailing lists