lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67abe1903a8ed_2d1e2942f@dwillia2-xfh.jf.intel.com.notmuch>
Date: Tue, 11 Feb 2025 15:47:28 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Terry Bowman <terry.bowman@....com>, <linux-cxl@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>,
	<nifan.cxl@...il.com>, <dave@...olabs.net>, <jonathan.cameron@...wei.com>,
	<dave.jiang@...el.com>, <alison.schofield@...el.com>,
	<vishal.l.verma@...el.com>, <dan.j.williams@...el.com>,
	<bhelgaas@...gle.com>, <mahesh@...ux.ibm.com>, <ira.weiny@...el.com>,
	<oohall@...il.com>, <Benjamin.Cheatham@....com>, <rrichter@....com>,
	<nathan.fontenot@....com>, <Smita.KoralahalliChannabasappa@....com>,
	<lukas@...ner.de>, <ming.li@...omail.com>,
	<PradeepVineshReddy.Kodamati@....com>
Subject: Re: [PATCH v7 04/17] PCI/AER: Modify AER driver logging to report
 CXL or PCIe bus error type

Terry Bowman wrote:
> The AER driver and aer_event tracing currently log 'PCIe Bus Type'
> for all errors.
> 
> Update the driver and aer_event tracing to log 'CXL Bus Type' for CXL
> device errors.
> 
> Signed-off-by: Terry Bowman <terry.bowman@....com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> Reviewed-by: Dave Jiang <dave.jiang@...el.com>
> Reviewed-by: Fan Ni <fan.ni@...sung.com>
> Reviewed-by: Ira Weiny <ira.weiny@...el.com>
> Reviewed-by: Gregory Price <gourry@...rry.net>

> ---
>  drivers/pci/pcie/aer.c  | 14 ++++++++------
>  include/ras/ras_event.h |  9 ++++++---
>  2 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 6e8de77d0fc4..f99a1c6fb274 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -694,13 +694,14 @@ static void __aer_print_error(struct pci_dev *dev,
>  
>  void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
>  {
> +	const char *bus_type = pcie_is_cxl(dev) ? "CXL"  : "PCIe";

I was expecting that this would be more than just a CXL link check
because CXL AER events are only a subset of the events that can trigger
an AER interrupt on a CXL device.

Also, with CXL protocol errors the TLP log is sourced from CXL RAS
registers and is distinct from the PCIe ->tlp in 'struct aer_err_info'.

All that to say that I think this patch probably wants to decorate the
bus type in 'struct aer_err_info' and then use that rather than just the ->is_cxl
flag of the device.

In the interest of moving the patch set along perhaps just do something
like this for now and circle back to make it more sophisticated later:

diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 01e51db8d285..eed098c134a6 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -533,6 +533,7 @@ static inline bool pci_dev_test_and_set_removed(struct pci_dev *dev)
 struct aer_err_info {
        struct pci_dev *dev[AER_MAX_MULTI_ERR_DEVICES];
        int error_dev_num;
+       bool is_cxl;
 
        unsigned int id:16;
 
@@ -549,6 +550,11 @@ struct aer_err_info {
        struct pcie_tlp_log tlp;        /* TLP Header */
 };
 
+static inline const char *aer_err_bus(struct aer_err_info *info)
+{
+       return info->is_cxl ? "CXL" : "PCIe";
+}
+
 int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info);
 void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
 
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 508474e17183..405f15c878ff 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1211,6 +1211,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
        /* Must reset in this function */
        info->status = 0;
        info->tlp_header_valid = 0;
+       info->is_cxl = dev->is_cxl;
 
        /* The device might not support AER */
        if (!aer)

Other than that, this patch looks good to me:

Reviewed-by: Dan Williams <dan.j.williams@...el.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ