Message-ID: <20251001164248.0000182a@huawei.com>
Date: Wed, 1 Oct 2025 16:42:48 +0100
From: Jonathan Cameron <jonathan.cameron@...wei.com>
To: Terry Bowman <terry.bowman@....com>
CC: <dave@...olabs.net>, <dave.jiang@...el.com>, <alison.schofield@...el.com>,
<dan.j.williams@...el.com>, <bhelgaas@...gle.com>, <shiju.jose@...wei.com>,
<ming.li@...omail.com>, <Smita.KoralahalliChannabasappa@....com>,
<rrichter@....com>, <dan.carpenter@...aro.org>,
<PradeepVineshReddy.Kodamati@....com>, <lukas@...ner.de>,
<Benjamin.Cheatham@....com>, <sathyanarayanan.kuppuswamy@...ux.intel.com>,
<linux-cxl@...r.kernel.org>, <alucerop@....com>, <ira.weiny@...el.com>,
<linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>
Subject: Re: [PATCH v12 06/25] CXL/AER: Introduce aer_cxl_rch.c into AER
driver for handling CXL RCH errors
On Thu, 25 Sep 2025 17:34:21 -0500
Terry Bowman <terry.bowman@....com> wrote:
> The restricted CXL Host (RCH) AER error handling logic currently resides
> in the AER driver file, drivers/pci/pcie/aer.c. CXL specific changes are
> conditionally compiled using #ifdefs.
>
> Improve the AER driver maintainability by separating the RCH specific logic
> from the AER driver's core functionality and removing the ifdefs. Introduce
> drivers/pci/pcie/aer_cxl_rch.c for moving the RCH AER logic into.
> Conditionally compile the file using the CONFIG_CXL_RCH_RAS Kconfig.
>
> Move the CXL logic into the new file but leave helper functions in aer.c
> for now as they will be moved in a future patch for CXL virtual hierarchy
> handling. Export the handler functions as needed. Export
> pci_aer_unmask_internal_errors() allowing for all subsystems to use.
> Avoid multiple declaration moves and export cxl_error_is_native() now to
> allow for cxl_core access.
>
> In order to maintain compilation after the move, other changes are required.
> Change cxl_rch_handle_error() & cxl_rch_enable_rcec() to be non-static
> in order to access them from the AER driver in aer.c.
>
> Signed-off-by: Terry Bowman <terry.bowman@....com>
>
> ---
>
> Changes in v11->v12:
> - Rename drivers/pci/pcie/cxl_rch.c to drivers/pci/pcie/aer_cxl_rch.c (Lukas)
> - Removed forward declaration of 'struct aer_err_info' in pci/pci.h (Terry)
Unwise given the bot reply.
The fun part is that, I think, the forward declaration is only needed in the
!CONFIG_CXL_RCH_RAS case, as that can occur with !CONFIG_PCIE_AER.
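A rough sketch of what I mean, not the actual header. The stub names come
from the commit message, but the signatures and the layout are my
assumptions. When a struct type is first named inside a parameter list it
only has prototype scope, so compilers warn unless a file-scope declaration
like the one below comes first.

```c
#include <stddef.h>

/* Assumed layout for illustration only; not copied from drivers/pci/pci.h. */
struct pci_dev;
struct aer_err_info;	/* forward declaration: only pointers are used below */

#ifdef CONFIG_CXL_RCH_RAS
/* Real implementations would live in aer_cxl_rch.c (signatures assumed). */
void cxl_rch_handle_error(struct pci_dev *dev, struct aer_err_info *info);
void cxl_rch_enable_rcec(struct pci_dev *rcec);
#else
/*
 * Stubs for the !CONFIG_CXL_RCH_RAS case. If the full definition of
 * struct aer_err_info is not visible here, the forward declaration
 * above is what keeps this prototype well-formed.
 */
static inline void cxl_rch_handle_error(struct pci_dev *dev,
					struct aer_err_info *info) { }
static inline void cxl_rch_enable_rcec(struct pci_dev *rcec) { }
#endif
```

Built without CONFIG_CXL_RCH_RAS defined, this compiles to the two empty
stubs, which is the configuration I'd expect to trip without the forward
declaration.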
Other than that, just a few trivial comments.
Reviewed-by: Jonathan Cameron <jonathan.cameron@...wei.com>
> diff --git a/drivers/pci/pcie/aer_cxl_rch.c b/drivers/pci/pcie/aer_cxl_rch.c
> new file mode 100644
> index 000000000000..bfe071eebf67
> --- /dev/null
> +++ b/drivers/pci/pcie/aer_cxl_rch.c
> @@ -0,0 +1,99 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2025 AMD Corporation. All rights reserved. */
For a code move, I think the copyright date at least should be a bit older.
> +
> +#include <linux/pci.h>
> +#include <linux/aer.h>
> +#include <linux/bitfield.h>
> +#include "../pci.h"
> +
> +static bool is_cxl_mem_dev(struct pci_dev *dev)
> +{
> + /*
> + * The capability, status, and control fields in Device 0,
> + * Function 0 DVSEC control the CXL functionality of the
> + * entire device (CXL 3.0, 8.1.3).
> + */
> + if (dev->devfn != PCI_DEVFN(0, 0))
> + return false;
> +
> + /*
> + * CXL Memory Devices must have the 502h class code set (CXL
> + * 3.0, 8.1.12.1).
> + */
> + if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> + return false;
> +
> + return true;
> +}
> +
> +static int cxl_rch_handle_error_iter(struct pci_dev *dev, void *data)
> +{
> + struct aer_err_info *info = (struct aer_err_info *)data;
> + const struct pci_error_handlers *err_handler;
> +
> + if (!is_cxl_mem_dev(dev) || !cxl_error_is_native(dev))
> + return 0;
> +
> + /* Protect dev->driver */
> + device_lock(&dev->dev);
Unrelated, but guard() might be nice to use here. Perhaps that's
in a later patch.
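To illustrate the shape, here is an untested userspace sketch of the same
scope-based pattern using the GCC/Clang cleanup attribute. struct fake_dev,
FAKE_DEV_GUARD() and handle_error_iter() are made-up stand-ins, not kernel
APIs. In the kernel I'd expect this to be guard(device)(&dev->dev), assuming
the device guard from linux/device.h is usable here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct pci_dev plus its device lock. */
struct fake_dev {
	int locked;		/* 1 while the "device lock" is held */
	bool has_handler;	/* models dev->driver->err_handler != NULL */
	int handled;		/* counts handler invocations */
};

static struct fake_dev *fake_dev_lock(struct fake_dev *dev)
{
	dev->locked = 1;
	return dev;
}

/* Cleanup callback: runs automatically when the guard variable leaves scope. */
static void fake_dev_unlock(struct fake_dev **devp)
{
	(*devp)->locked = 0;
}

/* Take the lock now and release it on every exit path of the scope. */
#define FAKE_DEV_GUARD(dev) \
	struct fake_dev *guard_ __attribute__((cleanup(fake_dev_unlock))) = \
		fake_dev_lock(dev)

static int handle_error_iter(struct fake_dev *dev)
{
	FAKE_DEV_GUARD(dev);

	if (!dev->has_handler)
		return 0;	/* early return replaces "goto out" */

	dev->handled++;		/* stands in for calling the error handler */
	return 0;
}
```

The point is only that the out: label and the explicit unlock go away,
because the lock is dropped on every return path.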
> +
> + err_handler = dev->driver ? dev->driver->err_handler : NULL;
> + if (!err_handler)
> + goto out;
> +
> + if (info->severity == AER_CORRECTABLE) {
> + if (err_handler->cor_error_detected)
> + err_handler->cor_error_detected(dev);
> + } else if (err_handler->error_detected) {
> + if (info->severity == AER_NONFATAL)
> + err_handler->error_detected(dev, pci_channel_io_normal);
> + else if (info->severity == AER_FATAL)
> + err_handler->error_detected(dev, pci_channel_io_frozen);
> + }
> +out:
> + device_unlock(&dev->dev);
> + return 0;
> +}