Message-ID: <20260114182055.46029-35-terry.bowman@amd.com>
Date: Wed, 14 Jan 2026 12:20:55 -0600
From: Terry Bowman <terry.bowman@....com>
To: <dave@...olabs.net>, <jonathan.cameron@...wei.com>,
<dave.jiang@...el.com>, <alison.schofield@...el.com>,
<dan.j.williams@...el.com>, <bhelgaas@...gle.com>, <shiju.jose@...wei.com>,
<ming.li@...omail.com>, <Smita.KoralahalliChannabasappa@....com>,
<rrichter@....com>, <dan.carpenter@...aro.org>,
<PradeepVineshReddy.Kodamati@....com>, <lukas@...ner.de>,
<Benjamin.Cheatham@....com>, <sathyanarayanan.kuppuswamy@...ux.intel.com>,
<linux-cxl@...r.kernel.org>, <vishal.l.verma@...el.com>, <alucerop@....com>,
<ira.weiny@...el.com>
CC: <linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>,
<terry.bowman@....com>
Subject: [PATCH v14 34/34] cxl: Enable CXL protocol errors during CXL Port probe
CXL protocol errors are not enabled for all CXL devices after boot. They
must be enabled in order to process CXL protocol errors.
Introduce cxl_unmask_proto_interrupts() to call pci_aer_unmask_internal_errors().
pci_aer_unmask_internal_errors() expects pdev->aer_cap to be initialized,
but pdev->aer_cap is not initialized for CXL Upstream Switch Ports and CXL
Downstream Switch Ports. Initialize pdev->aer_cap if necessary. Enable AER
correctable internal errors and uncorrectable internal errors for all CXL
devices.
Signed-off-by: Terry Bowman <terry.bowman@....com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@...el.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@....com>
---
Changes in v13->v14:
- Update commit title's prefix (Bjorn)
Changes in v12->v13:
- Add dev and dev_is_pci() NULL checks in cxl_unmask_proto_interrupts() (Terry)
- Add Dave Jiang's and Ben's Reviewed-by tags
Changes in v11->v12:
- None
Changes in v10->v11:
- Added check for valid PCI devices in is_cxl_error() (Terry)
- Removed check for RCiEP in cxl_handle_proto_err() and
cxl_report_error_detected() (Terry)
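For illustration only, the control flow of cxl_unmask_proto_interrupts()
can be modelled in user space roughly as below. This is a sketch, not
kernel code: struct model_pdev and the model_*() helpers are hypothetical
stand-ins for struct pci_dev, pci_find_ext_capability() and
pci_aer_unmask_internal_errors(), and config space is faked with an array.
The register offsets and bit values are the ones from
include/uapi/linux/pci_regs.h. The assumption that unmasking means
clearing the internal-error bits in the two AER mask registers reflects my
reading of the PCI core helper and is not something this patch changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_ERR_UNCOR_MASK	0x08		/* Uncorrectable Error Mask */
#define PCI_ERR_UNC_INTN	0x00400000u	/* Uncorrectable Internal Error */
#define PCI_ERR_COR_MASK	0x14		/* Correctable Error Mask */
#define PCI_ERR_COR_INTERNAL	0x00004000u	/* Corrected Internal Error */

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct model_pdev {
	uint16_t aer_cap;		/* cached AER offset, 0 if not looked up */
	uint16_t hw_aer_offset;		/* where the device has AER, 0 if absent */
	uint32_t cfg[0x1000 / 4];	/* fake 4 KiB extended config space */
};

/* Stand-in for pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_ERR). */
static uint16_t model_find_aer(const struct model_pdev *pdev)
{
	return pdev->hw_aer_offset;
}

/*
 * Stand-in for pci_aer_unmask_internal_errors(): it relies on aer_cap
 * being valid, which is the precondition the patch has to establish.
 */
static void model_unmask_internal_errors(struct model_pdev *pdev)
{
	assert(pdev->aer_cap != 0);

	pdev->cfg[(pdev->aer_cap + PCI_ERR_UNCOR_MASK) / 4] &= ~PCI_ERR_UNC_INTN;
	pdev->cfg[(pdev->aer_cap + PCI_ERR_COR_MASK) / 4] &= ~PCI_ERR_COR_INTERNAL;
}

/* Same shape as the patch: bail out early, fill in aer_cap lazily, unmask. */
void model_unmask_proto_interrupts(struct model_pdev *pdev)
{
	if (!pdev)
		return;

	if (!pdev->aer_cap) {
		pdev->aer_cap = model_find_aer(pdev);
		if (!pdev->aer_cap)
			return;	/* no AER capability: nothing to unmask */
	}

	model_unmask_internal_errors(pdev);
}
```

A switch port modelled with aer_cap == 0 and an AER capability at some
offset ends up with aer_cap filled in and only the two internal-error mask
bits cleared; a device without AER is left untouched.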
---
drivers/cxl/core/port.c | 2 ++
drivers/cxl/core/ras.c | 22 ++++++++++++++++++++++
drivers/cxl/cxlpci.h | 4 ++++
3 files changed, 28 insertions(+)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 0bec10be5d56..588801c5d406 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1828,6 +1828,8 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
rc = cxl_add_ep(dport, &cxlmd->dev);
+ cxl_unmask_proto_interrupts(cxlmd->cxlds->dev);
+
/*
* If the endpoint already exists in the port's list,
* that's ok, it was added on a previous pass.
diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
index 427009a8a78a..e299eb50fbe4 100644
--- a/drivers/cxl/core/ras.c
+++ b/drivers/cxl/core/ras.c
@@ -117,6 +117,24 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
}
static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
+void cxl_unmask_proto_interrupts(struct device *dev)
+{
+	if (!dev || !dev_is_pci(dev))
+		return;
+
+	struct pci_dev *pdev __free(pci_dev_put) = pci_dev_get(to_pci_dev(dev));
+
+	if (!pdev->aer_cap) {
+		pdev->aer_cap = pci_find_ext_capability(pdev,
+							PCI_EXT_CAP_ID_ERR);
+		if (!pdev->aer_cap)
+			return;
+	}
+
+	pci_aer_unmask_internal_errors(pdev);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_unmask_proto_interrupts, "CXL");
+
static void cxl_dport_map_ras(struct cxl_dport *dport)
{
struct cxl_register_map *map = &dport->reg_map;
@@ -127,6 +145,8 @@ static void cxl_dport_map_ras(struct cxl_dport *dport)
else if (cxl_map_component_regs(map, &dport->regs.component,
BIT(CXL_CM_CAP_CAP_ID_RAS)))
dev_dbg(dev, "Failed to map RAS capability.\n");
+
+ cxl_unmask_proto_interrupts(dev);
}
/**
@@ -159,6 +179,8 @@ void devm_cxl_port_ras_setup(struct cxl_port *port)
if (cxl_map_component_regs(map, &port->regs,
BIT(CXL_CM_CAP_CAP_ID_RAS)))
dev_dbg(&port->dev, "Failed to map RAS capability\n");
+
+ cxl_unmask_proto_interrupts(port->uport_dev);
}
EXPORT_SYMBOL_NS_GPL(devm_cxl_port_ras_setup, "CXL");
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 3d70f9b4a193..0c915c0bdfac 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -89,6 +89,7 @@ void __cxl_uport_init_ras_reporting(struct cxl_port *port,
int __cxl_await_media_ready(struct cxl_dev_state *cxlds);
resource_size_t __cxl_rcd_component_reg_phys(struct device *dev,
struct cxl_dport *dport);
+void cxl_unmask_proto_interrupts(struct device *dev);
#else
static inline void cxl_pci_cor_error_detected(struct pci_dev *pdev)
{
@@ -104,6 +105,9 @@ static inline void devm_cxl_dport_ras_setup(struct cxl_dport *dport)
static inline void devm_cxl_port_ras_setup(struct cxl_port *port)
{
}
+static inline void cxl_unmask_proto_interrupts(struct device *dev)
+{
+}
#endif
int cxl_port_setup_regs(struct cxl_port *port,
--
2.34.1