Message-ID: <910f6bda-4f18-47a9-9150-8489685c857d@amd.com>
Date: Wed, 10 Sep 2025 10:26:19 -0500
From: "Bowman, Terry" <terry.bowman@....com>
To: Lukas Wunner <lukas@...ner.de>
Cc: dave@...olabs.net, jonathan.cameron@...wei.com, dave.jiang@...el.com,
 alison.schofield@...el.com, dan.j.williams@...el.com, bhelgaas@...gle.com,
 shiju.jose@...wei.com, ming.li@...omail.com,
 Smita.KoralahalliChannabasappa@....com, rrichter@....com,
 dan.carpenter@...aro.org, PradeepVineshReddy.Kodamati@....com,
 Benjamin.Cheatham@....com, sathyanarayanan.kuppuswamy@...ux.intel.com,
 linux-cxl@...r.kernel.org, alucerop@....com, ira.weiny@...el.com,
 linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [PATCH v11 09/23] PCI/AER: Report CXL or PCIe bus error type in
 trace logging



On 8/27/2025 2:37 AM, Lukas Wunner wrote:
> On Tue, Aug 26, 2025 at 08:35:24PM -0500, Terry Bowman wrote:
>> The AER service driver and aer_event tracing currently log 'PCIe Bus Type'
>> for all errors. Update the driver and aer_event tracing to log 'CXL Bus
>> Type' for CXL device errors.
>>
>> This requires the AER can identify and distinguish between PCIe errors and
>> CXL errors.
>>
>> Introduce boolean 'is_cxl' to 'struct aer_err_info'. Add assignment in
>> aer_get_device_error_info() and pci_print_aer().
>>
>> Update the aer_event trace routine to accept a bus type string parameter.
> aer_print_error() has a pointer to the struct pci_dev and you've added
> an is_cxl bit to that struct in the preceding patch.
>
> Is there a reason why you can't just use that dev->is_cxl bit, in lieu of
> adding another is_cxl bit to struct aer_err_info?
>
> If so, please document it in a code comment or at least in the commit
> message.  If there isn't, please use dev->is_cxl.
>
> Thanks,
>
> Lukas
Hi Lukas,

The 'is_cxl' member was added to 'struct aer_err_info' at Dan Williams' request
during v7 review:
https://lore.kernel.org/linux-cxl/67abe1903a8ed_2d1e2942f@dwillia2-xfh.jf.intel.com.notmuch/

My understanding is that the change was requested so that the bus error
type is captured together with the actual AER status. This matters because
the device's bus state can change between the time the AER status is
captured and the time the error is handled or logged. A training HW error
is one example. Caching 'is_cxl' in 'struct aer_err_info' lets the drivers
identify the error's bus type correctly for later logging and handling.
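
As a rough illustration of the caching idea, here is a userspace sketch. It
is not the patch itself: the mock_* structs and helpers are hypothetical
stand-ins for 'struct pci_dev' and 'struct aer_err_info', and the only thing
taken from the series is the choice between the 'CXL Bus Type' and
'PCIe Bus Type' strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev; only the bit discussed here. */
struct mock_pci_dev {
	bool is_cxl;	/* live state, may change after the error is captured */
};

/* Hypothetical stand-in for struct aer_err_info. */
struct mock_aer_err_info {
	uint32_t status;	/* AER status as read at capture time */
	bool is_cxl;		/* bus type cached together with the status */
};

/* Snapshot the status and the bus type in one step (hypothetical helper). */
static void mock_capture_err_info(const struct mock_pci_dev *dev,
				  uint32_t status,
				  struct mock_aer_err_info *info)
{
	assert(dev && info);
	info->status = status;
	info->is_cxl = dev->is_cxl;
}

/* Pick the bus type string from the cached bit, not from the live device. */
static const char *mock_bus_type(const struct mock_aer_err_info *info)
{
	return info->is_cxl ? "CXL Bus Type" : "PCIe Bus Type";
}
```

If the device's live 'is_cxl' state flips after mock_capture_err_info() has
run, mock_bus_type() still reports the type that was in effect when the
status was read, which is the behavior I believe the cached bit is meant to
provide.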

Hopefully Dan will add his thoughts here.

Regards,
Terry

