Message-ID: <fd1b14d6-8c0e-4e55-942c-3efb3982c010@amd.com>
Date: Thu, 15 Jan 2026 11:29:50 -0600
From: "Bowman, Terry" <terry.bowman@....com>
To: Jonathan Cameron <jonathan.cameron@...wei.com>
Cc: dave@...olabs.net, dave.jiang@...el.com, alison.schofield@...el.com,
dan.j.williams@...el.com, bhelgaas@...gle.com, shiju.jose@...wei.com,
ming.li@...omail.com, Smita.KoralahalliChannabasappa@....com,
rrichter@....com, dan.carpenter@...aro.org,
PradeepVineshReddy.Kodamati@....com, lukas@...ner.de,
Benjamin.Cheatham@....com, sathyanarayanan.kuppuswamy@...ux.intel.com,
linux-cxl@...r.kernel.org, vishal.l.verma@...el.com, alucerop@....com,
ira.weiny@...el.com, linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [PATCH v14 30/34] PCI/AER: Dequeue forwarded CXL error

On 1/15/2026 10:01 AM, Jonathan Cameron wrote:
> On Wed, 14 Jan 2026 12:20:51 -0600
> Terry Bowman <terry.bowman@....com> wrote:
>
>> The AER driver now forwards CXL protocol errors to the CXL driver via a
>> kfifo. The CXL driver must consume these work items and initiate protocol
>> error handling while ensuring the device's RAS mappings remain valid
>> throughout processing.
>>
>> Implement cxl_proto_err_work_fn() to dequeue work items forwarded by the
>> AER service driver. Lock the parent CXL Port device to ensure the CXL
>> device's RAS registers are accessible during handling. Add a pdev reference
>> put to match the reference get in the AER driver. The held reference keeps
>> pdev valid after the kfifo dequeue. These changes apply to CXL Ports and
>> CXL Endpoints.
>>
>> Signed-off-by: Terry Bowman <terry.bowman@....com>
>
> A few things inline.
> Thanks,
>
> Jonathan
>
>> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
>> index bf82880e19b4..0c640b84ad70 100644
>> --- a/drivers/cxl/core/ras.c
>> +++ b/drivers/cxl/core/ras.c
>> @@ -117,17 +117,6 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
>
>> +/*
>> + * Return 'struct cxl_port *' parent CXL Port of dev
>> + *
>> + * Reference count increments returned port on success
>> + *
>> + * @pdev: Find the parent CXL Port of this device
>
> This is a nonstandard comment style. I'd make it formal
> kernel-doc.
>
>
Ok, I'll update it.
>
>> +
>> +static void cxl_proto_err_work_fn(struct work_struct *work)
>> +{
>> + struct cxl_proto_err_work_data wd;
>> +
>> + while (cxl_proto_err_kfifo_get(&wd)) {
>
> I'm probably being slow today but where does that helper come from?
>
It comes from drivers/pci/pcie/aer_cxl_vh.c.
Thanks for reviewing.
-Terry
>> + struct pci_dev *pdev __free(pci_dev_put) = wd.pdev;
>> +
>> + if (!pdev) {
>> + pr_err_ratelimited("NULL PCI device passed in AER-CXL KFIFO\n");
>> + continue;
>> + }
>> +
>> + struct cxl_port *port __free(put_cxl_port) = get_cxl_port(pdev);
>> + if (!port) {
>> + pr_err_ratelimited("Failed to find parent Port device in CXL topology.\n");
>> + continue;
>> + }
>> + guard(device)(&port->dev);
>> +
>> + cxl_handle_proto_error(&wd);
>> + }
>> +}
>
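
For readers unfamiliar with the pattern in the quoted loop, here is a minimal
userspace sketch of dequeueing refcounted work items with scope-based cleanup,
so that every early "continue" still drops the reference taken by the producer.
It is not the kernel code: fake_dev, work_item, fifo, fifo_put(), fifo_get(),
fake_dev_put() and drain_fifo() are all hypothetical names, the ring buffer is
a stand-in for the kfifo, and the GCC/Clang cleanup attribute is assumed as a
rough analogue of the kernel's __free() helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device such as struct pci_dev. */
struct fake_dev {
	int refcount;
	bool has_port;	/* pretend result of the parent port lookup */
};

/* Hypothetical work item, loosely modeled on the forwarded work data. */
struct work_item {
	struct fake_dev *dev;
};

#define FIFO_SIZE 8	/* power of two so the index wrap stays consistent */

struct fifo {
	struct work_item items[FIFO_SIZE];
	size_t head, tail;
};

/* Producer side: take a reference before queueing the item. */
static bool fifo_put(struct fifo *f, struct work_item wi)
{
	if (f->tail - f->head == FIFO_SIZE)
		return false;
	if (wi.dev)
		wi.dev->refcount++;
	f->items[f->tail++ % FIFO_SIZE] = wi;
	return true;
}

/* Consumer side: returns false once the FIFO is empty. */
static bool fifo_get(struct fifo *f, struct work_item *out)
{
	if (f->head == f->tail)
		return false;
	*out = f->items[f->head++ % FIFO_SIZE];
	return true;
}

/* Cleanup handler: called with a pointer to the annotated variable. */
static void fake_dev_put(struct fake_dev **devp)
{
	if (*devp) {
		assert((*devp)->refcount > 0);
		(*devp)->refcount--;
	}
}

/* Drain the FIFO; returns how many items were actually handled. */
static int drain_fifo(struct fifo *f)
{
	struct work_item wi;
	int handled = 0;

	while (fifo_get(f, &wi)) {
		/* The reference is dropped whenever this scope exits. */
		struct fake_dev *dev __attribute__((cleanup(fake_dev_put))) = wi.dev;

		if (!dev)
			continue;	/* nothing to put for a NULL item */
		if (!dev->has_port)
			continue;	/* reference still dropped here */

		handled++;		/* error handling would go here */
	}
	return handled;
}
```

Because the variable is declared inside the loop body, the put runs once per
iteration on every exit path, which is what lets the error branches use a bare
"continue" without leaking the producer's reference.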