Message-ID: <20251104182603.GA1862095@bhelgaas>
Date: Tue, 4 Nov 2025 12:26:03 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Terry Bowman <terry.bowman@....com>
Cc: dave@...olabs.net, jonathan.cameron@...wei.com, dave.jiang@...el.com,
alison.schofield@...el.com, dan.j.williams@...el.com,
bhelgaas@...gle.com, shiju.jose@...wei.com, ming.li@...omail.com,
Smita.KoralahalliChannabasappa@....com, rrichter@....com,
dan.carpenter@...aro.org, PradeepVineshReddy.Kodamati@....com,
lukas@...ner.de, Benjamin.Cheatham@....com,
sathyanarayanan.kuppuswamy@...ux.intel.com,
linux-cxl@...r.kernel.org, alucerop@....com, ira.weiny@...el.com,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [RESEND v13 09/25] PCI/AER: Report CXL or PCIe bus error type in trace logging

On Tue, Nov 04, 2025 at 11:02:49AM -0600, Terry Bowman wrote:
> The AER service driver and aer_event tracing currently log 'PCIe Bus Type'
> for all errors. Update the driver and aer_event tracing to log 'CXL Bus
> Type' for CXL device errors.
>
> This requires the AER can identify and distinguish between PCIe errors and
> CXL errors.

s/requires the AER/requires that AER/

Acked-by: Bjorn Helgaas <bhelgaas@...gle.com>

> +/**
> + * struct aer_err_info - AER Error Information
> + * @dev: Devices reporting error
> + * @ratelimit_print: Flag to log or not log the devices' error. 0=NotLog/1=Log
> + * @error_devnum: Number of devices reporting an error
> + * @level: printk level to use in logging
> + * @id: Value from register PCI_ERR_ROOT_ERR_SRC
> + * @severity: AER severity, 0-UNCOR Non-fatal, 1-UNCOR fatal, 2-COR
> + * @root_ratelimit_print: Flag to log or not log the root's error. 0=NotLog/1=Log
> + * @multi_error_valid: If multiple errors are reported
> + * @first_error: First reported error
> + * @is_cxl: Bus type error: 0-PCI Bus error, 1-CXL Bus error
> + * @tlp_header_valid: Indicates if TLP field contains error information
> + * @status: COR/UNCOR error status
> + * @mask: COR/UNCOR mask
> + * @tlp: Transaction packet information
> + */

Would you mind splitting the kernel-doc addition and the comment move
into a separate patch that does only that? That would make the
functional changes more obvious.

> struct aer_err_info {
> struct pci_dev *dev[AER_MAX_MULTI_ERR_DEVICES];
> int ratelimit_print[AER_MAX_MULTI_ERR_DEVICES];
> int error_dev_num;
> - const char *level; /* printk level */
> + const char *level;
>
> unsigned int id:16;
>
> - unsigned int severity:2; /* 0:NONFATAL | 1:FATAL | 2:COR */
> - unsigned int root_ratelimit_print:1; /* 0=skip, 1=print */
> + unsigned int severity:2;
> + unsigned int root_ratelimit_print:1;
> unsigned int __pad1:4;
> unsigned int multi_error_valid:1;
>
> unsigned int first_error:5;
> - unsigned int __pad2:2;
> + unsigned int __pad2:1;
> + bool is_cxl:1;
> unsigned int tlp_header_valid:1;
>
> - unsigned int status; /* COR/UNCOR Error Status */
> - unsigned int mask; /* COR/UNCOR Error Mask */
> - struct pcie_tlp_log tlp; /* TLP Header */
> + unsigned int status;
> + unsigned int mask;
> + struct pcie_tlp_log tlp;
> };
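
For illustration only, here is a minimal userspace sketch of how a
one-bit flag such as is_cxl could select the bus-type string in a log
line. The struct is a simplified stand-in for the flag bits above, and
the names err_info_sketch, bus_type_name() and format_bus_error(), as
well as the "Bus Error" message format, are assumptions made for this
example, not code from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Simplified stand-in for the last flag bits of struct aer_err_info.
 * This is NOT the kernel definition; it only mirrors the bit widths
 * shown in the quoted diff (5 + 1 + 1 + 1 bits).
 */
struct err_info_sketch {
	unsigned int first_error:5;
	unsigned int pad2:1;
	bool is_cxl:1;			/* 0: PCIe bus error, 1: CXL bus error */
	unsigned int tlp_header_valid:1;
};

/* Hypothetical helper: map the flag to a bus-type name. */
static const char *bus_type_name(const struct err_info_sketch *info)
{
	return info->is_cxl ? "CXL" : "PCIe";
}

/*
 * Hypothetical formatter: writes "<bus> Bus Error" into buf and
 * returns the snprintf() result. The message format is assumed.
 */
static int format_bus_error(const struct err_info_sketch *info,
			    char *buf, size_t len)
{
	return snprintf(buf, len, "%s Bus Error", bus_type_name(info));
}
```

Keeping the name lookup in one helper means the printk path and the
trace event would both derive the string from the same flag.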