[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2413985e-ef5f-48c2-a2d1-1cce91965752@amd.com>
Date: Tue, 9 Dec 2025 09:17:15 -0600
From: "Bowman, Terry" <terry.bowman@....com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: dave@...olabs.net, jonathan.cameron@...wei.com, dave.jiang@...el.com,
alison.schofield@...el.com, dan.j.williams@...el.com, bhelgaas@...gle.com,
shiju.jose@...wei.com, ming.li@...omail.com,
Smita.KoralahalliChannabasappa@....com, rrichter@....com,
dan.carpenter@...aro.org, PradeepVineshReddy.Kodamati@....com,
lukas@...ner.de, Benjamin.Cheatham@....com,
sathyanarayanan.kuppuswamy@...ux.intel.com, linux-cxl@...r.kernel.org,
alucerop@....com, ira.weiny@...el.com, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org
Subject: Re: [PATCH v13 20/25] CXL/PCI: Introduce CXL Port protocol error
handlers
On 12/8/2025 12:37 PM, Bjorn Helgaas wrote:
> On Mon, Nov 03, 2025 at 06:09:56PM -0600, Terry Bowman wrote:
>> Add CXL protocol error handlers for CXL Port devices (Root Ports,
>> Downstream Ports, and Upstream Ports). Implement cxl_port_cor_error_detected()
>> and cxl_port_error_detected() to handle correctable and uncorrectable errors
>> respectively.
>>
>> Introduce cxl_get_ras_base() to retrieve the cached RAS register base
>> address for a given CXL port. This function supports CXL Root Ports,
>> Downstream Ports, and Upstream Ports by returning their previously mapped
>> RAS register addresses.
>>
>> Add device lock assertions to protect against concurrent device or RAS
>> register removal during error handling. The port error handlers require
>> two device locks:
>>
>> 1. The port's CXL parent device - RAS registers are mapped using devm_*
>> functions with the parent port as the host. Locking the parent prevents
>> the RAS registers from being unmapped during error handling.
>>
>> 2. The PCI device (pdev->dev) - Locking prevents concurrent modifications
>> to the PCI device structure during error handling.
>>
>> The lock assertions added here will be satisfied by device locks introduced
>> in a subsequent patch.
>
> Weird. Can't you add the lock assertions at the same time you add the
> locks?
>
It is a bit. I will try to fix this. I might try adding adding the lockdep()
checks in the later later patch.
>> Introduce get_pci_cxl_host_dev() to return the device responsible for
>> managing the RAS register mapping. This function increments the reference
>> count on the host device to prevent premature resource release during error
>> handling. The caller is responsible for decrementing the reference count.
>> For CXL endpoints, which manage resources without a separate host device,
>> this function returns NULL.
>>
>> Update the AER driver's is_cxl_error() to recognize CXL Port devices in
>> addition to CXL Endpoints, as both now have CXL-specific error handlers.
>>
>> Signed-off-by: Terry Bowman <terry.bowman@....com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
>> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
>
> Acked-by: Bjorn Helgaas <bhelgaas@...gle.com>
>
Thanks.
>> @@ -1573,6 +1573,7 @@ static struct cxl_port *find_cxl_port_by_uport(struct device *uport_dev)
>> return to_cxl_port(dev);
>> return NULL;
>> }
>> +EXPORT_SYMBOL_NS_GPL(find_cxl_port_by_uport, "CXL");
>
> The usual export question: is there a modular caller()?
>
This should be non-static and non-exported. I'll change.
>> + dev_warn_once(dev, "Error: Unsupported device type (%X)", pci_pcie_type(pdev));
>
> Maybe "%#x" (add 0x prefix and use lower-case hex, unless there's a
> different CXL convention)?
Ok
-Terry
Powered by blists - more mailing lists