[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <609a02bb-3271-4021-9499-8b281a959f62@amd.com>
Date: Thu, 13 Feb 2025 09:43:40 -0600
From: "Bowman, Terry" <terry.bowman@....com>
To: Dan Williams <dan.j.williams@...el.com>, linux-cxl@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
nifan.cxl@...il.com, dave@...olabs.net, jonathan.cameron@...wei.com,
dave.jiang@...el.com, alison.schofield@...el.com, vishal.l.verma@...el.com,
bhelgaas@...gle.com, mahesh@...ux.ibm.com, ira.weiny@...el.com,
oohall@...il.com, Benjamin.Cheatham@....com, rrichter@....com,
nathan.fontenot@....com, Smita.KoralahalliChannabasappa@....com,
lukas@...ner.de, ming.li@...omail.com, PradeepVineshReddy.Kodamati@....com
Subject: Re: [PATCH v7 07/17] cxl/pci: Map CXL PCIe Root Port and Downstream
Switch Port RAS registers
On 2/11/2025 7:23 PM, Dan Williams wrote:
> Terry Bowman wrote:
>> The CXL mem driver (cxl_mem) currently maps and caches a pointer to RAS
>> registers for the endpoint's Root Port. The same needs to be done for
>> each of the CXL Downstream Switch Ports and CXL Root Ports found between
>> the endpoint and CXL Host Bridge.
>>
>> Introduce cxl_init_ep_ports_aer() to be called for each CXL Port in the
>> sub-topology between the endpoint and the CXL Host Bridge. This function
>> will determine if there are CXL Downstream Switch Ports or CXL Root Ports
>> associated with this Port. The same check will be added in the future for
>> upstream switch ports.
>>
>> Move the RAS register map logic from cxl_dport_map_ras() into
>> cxl_dport_init_ras_reporting(). This eliminates the need for the helper
>> function, cxl_dport_map_ras().
> Not sure about the motivation here...
>
>> cxl_init_ep_ports_aer() calls cxl_dport_init_ras_reporting() to map
>> the RAS registers for CXL Downstream Switch Ports and CXL Root Ports.
> Ok, makes sense...
>
>> cxl_dport_init_ras_reporting() must check for previously mapped registers
>> before mapping. This is required because multiple Endpoints under a CXL
>> switch may share an upstream CXL Root Port, CXL Downstream Switch Port,
>> or CXL Downstream Switch Port. Ensure the RAS registers are only mapped
>> once.
> Sounds broken. Every device upstream-port only has one downstream port.
>
> A CXL switch config looks like this:
>
> │
> ┌──────────┼────────────┐
> │SWITCH ┌┴─┐ │
> │ │UP│ │
> │ └─┬┘ │
> │ ┌──────┼─────┐ │
> │ │ │ │ │
> │ ┌┴─┐ ┌─┴┐ ┌─┴┐ │
> │ │DP│ │DP│ │DP│ │
> │ └┬─┘ └─┬┘ └─┬┘ │
> └────┼──────┼─────┼─────┘
> ┌┴─┐ ┌─┴┐ ┌─┴┐
> │EP│ │EP│ │EP│
> └──┘ └──┘ └──┘
>
> ...so how can an endpoint ever find that its immediate parent downstream
> port has already been mapped?
┌─┴─┐
│RP1│
└─┬─┘
┌───────────┼───────────┐
│SWITCH ┌─┴─┐ │
│ │UP1│ │ RP1 - 0c:00.0
│ └─┬─┘ │ UP1 - 0d:00.0
│ ┌──────┼─────┐ │ DP1 - 0e:00.0
│ │ │ │ │
│ ┌─┴─┐ ┌─┴─┐ ┌─┴─┐ │
│ │DP1│ │DP2│ │DP3│ │
│ └─┬─┘ └─┬─┘ └─┬─┘ │
└────┼──────┼─────┼─────┘
┌─┴─┐ ┌─┴─┐ ┌─┴─┐
│EP1│ │EP2│ │EP3│
└───┘ └───┘ └───┘
It cant but the root RP and USP have duplicate calls for each EP in the example diagram.
The function's purpose is to map RAS registers and cache the address. This reuses the
same function for RP and DSP. The DSP will never be previously mapped as you indicated.
>> Introduce a mutex for synchronizing accesses to the cached RAS mapping.
> I suspect the motivation for the lock and "previously mapped" check was
> due to noticing that the ras registers are not being unmapped, but
> that's due to a devm bug below.
The synchronization was added as result of review recommendation because it is
a racy area. It's possible that multiple endpoints using the same switch could
call this function from devm_cxl_add_endpoints()->cxl_init_ep_ports().
> Even if it were the case that multiple resources need to share 1 devm
> mapping, that would need to look something like the logic around
> cxl_detach_ep(). In that arrangement, the first endpoint in the door
> sets up the 'struct cxl_port' and its 'struct cxl_dport' instances, and
> the last endpoint out the door tears it all down and turns off the
> lights.
>
>> Signed-off-by: Terry Bowman <terry.bowman@....com>
>> Reviewed-by: Alejandro Lucero <alucerop@....com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
>> Reviewed-by: Gregory Price <gourry@...rry.net>
>> ---
>> drivers/cxl/core/pci.c | 42 ++++++++++++++++++++----------------------
>> drivers/cxl/cxl.h | 6 ++----
>> drivers/cxl/mem.c | 31 +++++++++++++++++++++++++++++--
>> 3 files changed, 51 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index a5c65f79db18..143c853a52c4 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -24,6 +24,8 @@ static unsigned short media_ready_timeout = 60;
>> module_param(media_ready_timeout, ushort, 0644);
>> MODULE_PARM_DESC(media_ready_timeout, "seconds to wait for media ready");
>>
>> +static DEFINE_MUTEX(ras_init_mutex);
>> +
>> struct cxl_walk_context {
>> struct pci_bus *bus;
>> struct cxl_port *port;
>> @@ -749,18 +751,6 @@ static void cxl_dport_map_rch_aer(struct cxl_dport *dport)
>> }
>> }
>>
>> -static void cxl_dport_map_ras(struct cxl_dport *dport)
>> -{
>> - struct cxl_register_map *map = &dport->reg_map;
>> - struct device *dev = dport->dport_dev;
>> -
>> - if (!map->component_map.ras.valid)
>> - dev_dbg(dev, "RAS registers not found\n");
>> - else if (cxl_map_component_regs(map, &dport->regs.component,
>> - BIT(CXL_CM_CAP_CAP_ID_RAS)))
>> - dev_dbg(dev, "Failed to map RAS capability.\n");
>> -}
>> -
>> static void cxl_disable_rch_root_ints(struct cxl_dport *dport)
>> {
>> void __iomem *aer_base = dport->regs.dport_aer;
>> @@ -788,22 +778,30 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport)
>> /**
>> * cxl_dport_init_ras_reporting - Setup CXL RAS report on this dport
>> * @dport: the cxl_dport that needs to be initialized
>> - * @host: host device for devm operations
>> */
>> -void cxl_dport_init_ras_reporting(struct cxl_dport *dport, struct device *host)
>> +void cxl_dport_init_ras_reporting(struct cxl_dport *dport)
>> {
>> - dport->reg_map.host = host;
>> - cxl_dport_map_ras(dport);
>> -
>> - if (dport->rch) {
>> - struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport->dport_dev);
>> -
>> - if (!host_bridge->native_aer)
>> - return;
>> + struct device *dport_dev = dport->dport_dev;
>> + struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev);
>>
>> + dport->reg_map.host = dport_dev;
> This seems to be confused about how devm works. @host is passed in
> because the cxl_memdev instance being probed in cxl_mem_probe() is doing
> setup work on behalf of @dport_dev.
>
> When the cxl_memdev goes through a ->remove() event, unbind from
> cxl_mem, it tears down that mapping.
>
> However, when using @dport_dev as the devm host, that mapping will not
> be torn down until either the @dport_dev goes through a ->remove() event
> or the device is unregistered altogether. There is no CXL subsystem
> coordination with a driver for @dport_dev. The PCIe portdrv might have
> an interest in it, but CXL can not depend on portdrv to map CXL
> registers or keep the device bound while CXL has an interest those
> registers. The devres_release_all() triggered by a
> "device_del(@dport_dev)" is also uncoordinated with any CXL interest. In
> general, it is a devm anti-pattern to depend on a device_del() event to
> trigger devres_release_all().
>
>
>> + if (dport->rch && host_bridge->native_aer) {
>> cxl_dport_map_rch_aer(dport);
>> cxl_disable_rch_root_ints(dport);
>> }
>> +
>> + /* dport may have more than 1 downstream EP. Check if already mapped. */
>> + mutex_lock(&ras_init_mutex);
> I suspect this lock and check got added to workaround "Failed to request
> region" messages coming out of devm_cxl_iomap_block() in testing? Per
> above, that's not "more than 1 downstream EPi", that's "failure to clean
> up the last mapping for the next cxl_mem_probe() event of the same
> endpoint".
Synchronization was added to handle the concurrent accesses. I never observed
issues due to the race condition for RP and USP but I confirmed through further
testing it is a real potential issue for the RP and USP.
You recommended, in the next patch, to map USP RAS registers from cxl_endpoint_port_probe().
Would you like the RP and DSP mapping to be called from cxl_endpoint_port_probe() as well? Terry
Powered by blists - more mailing lists