[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251208183603.GA3302185@bhelgaas>
Date: Mon, 8 Dec 2025 12:36:03 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Terry Bowman <terry.bowman@....com>
Cc: dave@...olabs.net, jonathan.cameron@...wei.com, dave.jiang@...el.com,
alison.schofield@...el.com, dan.j.williams@...el.com,
bhelgaas@...gle.com, shiju.jose@...wei.com, ming.li@...omail.com,
Smita.KoralahalliChannabasappa@....com, rrichter@....com,
dan.carpenter@...aro.org, PradeepVineshReddy.Kodamati@....com,
lukas@...ner.de, Benjamin.Cheatham@....com,
sathyanarayanan.kuppuswamy@...ux.intel.com,
linux-cxl@...r.kernel.org, alucerop@....com, ira.weiny@...el.com,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [PATCH v13 16/25] CXL/AER: Introduce pcie/aer_cxl_vh.c in AER
driver for forwarding CXL errors
Maybe:
PCI/AER: Add CXL error forwarding in aer_cxl_vh.c
On Mon, Nov 03, 2025 at 06:09:52PM -0600, Terry Bowman wrote:
> CXL virtual hierarchy (VH) RAS handling for CXL Port devices will be added
> soon. This requires a notification mechanism for the AER driver to share
> the AER interrupt with the CXL driver. The notification will be used as an
> indication for the CXL drivers to handle and log the CXL RAS errors.
>
> Note, 'CXL protocol error' terminology will refer to CXL VH and not
> CXL RCH errors unless specifically noted going forward.
>
> Introduce a new file in the AER driver to handle the CXL protocol errors
> named pci/pcie/aer_cxl_vh.c.
>
> Add a kfifo work queue to be used by the AER and CXL drivers. The AER
> driver will be the sole kfifo producer adding work and the cxl_core will be
> the sole kfifo consumer removing work. Add the boilerplate kfifo support.
> Encapsulate the kfifo, RW semaphore, and work pointer in a single structure.
>
> Add CXL work queue handler registration functions in the AER driver. Export
> the functions allowing CXL driver to access. Implement registration
> functions for the CXL driver to assign or clear the work handler function.
> Synchronize accesses using the RW semaphore.
>
> Introduce 'struct cxl_proto_err_work_data' to serve as the kfifo work data.
> This will contain a reference to the erring PCI device and the error
> severity. This will be used when the work is dequeued by the cxl_core driver.
s/erring PCI device/PCI error source device/
> +bool cxl_error_is_native(struct pci_dev *dev)
> +{
> + struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
> +
> + return (pcie_ports_native || host->native_aer);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_error_is_native, "CXL");
I don't see modular callers of any of these that would require
EXPORT().
> +++ b/include/linux/aer.h
> @@ -10,6 +10,7 @@
>
> #include <linux/errno.h>
> #include <linux/types.h>
> +#include <linux/workqueue_types.h>
Looks like "struct work_struct;" would be sufficient without including
linux/workqueue_types.h.
> +struct cxl_proto_err_work_data {
> + int severity;
> + struct pci_dev *pdev;
Is there a reason to order them this way? I would have put the pdev
pointer first because it's the more general part and might result in
better alignment in memory.
Powered by blists - more mailing lists