[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6882964d627d8_134cc710044@dwillia2-xfh.jf.intel.com.notmuch>
Date: Thu, 24 Jul 2025 13:23:41 -0700
From: <dan.j.williams@...el.com>
To: "Bowman, Terry" <terry.bowman@....com>, <dan.j.williams@...el.com>,
<dave@...olabs.net>, <jonathan.cameron@...wei.com>, <dave.jiang@...el.com>,
<alison.schofield@...el.com>, <bhelgaas@...gle.com>, <shiju.jose@...wei.com>,
<ming.li@...omail.com>, <Smita.KoralahalliChannabasappa@....com>,
<rrichter@....com>, <dan.carpenter@...aro.org>,
<PradeepVineshReddy.Kodamati@....com>, <lukas@...ner.de>,
<Benjamin.Cheatham@....com>, <sathyanarayanan.kuppuswamy@...ux.intel.com>,
<linux-cxl@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>
Subject: Re: [PATCH v10 04/17] CXL/AER: Introduce CXL specific AER driver file
Bowman, Terry wrote:
> On 7/23/2025 8:16 PM, dan.j.williams@...el.com wrote:
> > Terry Bowman wrote:
> >> The CXL AER error handling logic currently resides in the AER driver file,
> >> drivers/pci/pcie/aer.c. CXL specific changes are conditionally compiled
> >> using #ifdefs.
> >>
> >> Improve the AER driver maintainability by separating the CXL specific logic
> >> from the AER driver's core functionality and removing the #ifdefs.
> >> Introduce drivers/pci/pcie/cxl_aer.c and move the CXL AER logic into the
> >> new file.
> >>
> >> Update the makefile to conditionally compile the CXL file using the
> >> existing CONFIG_PCIEAER_CXL Kconfig.
> >>
> >> Signed-off-by: Terry Bowman <terry.bowman@....com>
> >> ---
> > After reading patch5 I want to qualify my Reviewed-by:...
> >
> >> drivers/pci/pci.h | 8 +++
> >> drivers/pci/pcie/Makefile | 1 +
> >> drivers/pci/pcie/aer.c | 138 -------------------------------------
> >> drivers/pci/pcie/cxl_aer.c | 138 +++++++++++++++++++++++++++++++++++++
> > This is a poor name for this file because the functionality only relates to
> > code that supports a dead-end generation of RCH / RCD hardware platforms.
> >
> > I do agree that it should be removed from aer.c so typical PCIe AER
> > maintenance does not need to trip over that cruft.
> >
> > Please call it something like rch_aer.c so it is tucked out of the way,
> > sticks out as odd in any future diffstat, and does not confuse from the
> > CXL VH error handling that supports current and future generation
> > hardware.
> >
> > Perhaps even move it to its own silent Kconfig symbol with a deprecation
> > warning, something like below, so someone remembers to delete it.
>
> cxl_rch_handle_error_iter() and cxl_rch_handle_error() need to be moved from pci/pcie/cxl_aer.c
> into cxl/core/native_ras.c introduced in this series. There is no RCH or VH handling in cxl_aer.c.
> cxl_aer.c serves to detect if an error is a CXL error and if it is then it forwards it to the
> CXL drivers using the kfifo introduced later. I will update the commit message stating more
> will be added later.
Wait, this set moves the same function to a new file twice in the same
set? I had not gotten that far along, but that's not acceptable.
The reasons I had assumed that the rch bits would remain as a vestigial
drivers/pci/pcie/rch_aer.c file to be cut from the kernel later are:
- The goal of forwarding protocol errors to the cxl_core is that the
cxl_core maintains a cxl_port hierarchy. For the RCH case there is no
hierarchy and little to no value in being able disposition or decorate
error reports with the cxl_port driver.
- The RCH code requires a series of new PCI core exports for this
one-off unfortunate mistake of history where the CXL specification
tried way too hard to hide the presence of CXL. If this code is
already on a deprecation path, that contraindicates new exports.
> Dave Jiang introduced cxl/core/pci_aer.c I understand the name is still up for possible change.
> The native_ras.c changes in this series is planned to be moved into cxl/core/pci_aer.c for v11.
> The files were created with the same purpose but we used different filenames and need to converge.
Why not put this stuff in the existing cxl/core/ras.c? I do expect that
we want to route CPER reports to cxl_port objects at some point, so the
"native" distinction is more confusing than beneficial as far as I can
see.
Powered by blists - more mailing lists