Message-ID: <20250818151855.2950059-1-joshua.hahnjy@gmail.com>
Date: Mon, 18 Aug 2025 08:18:53 -0700
From: Joshua Hahn <joshua.hahnjy@...il.com>
To: Terry Bowman <terry.bowman@....com>
Cc: dave@...olabs.net,
	jonathan.cameron@...wei.com,
	dave.jiang@...el.com,
	alison.schofield@...el.com,
	dan.j.williams@...el.com,
	bhelgaas@...gle.com,
	shiju.jose@...wei.com,
	ming.li@...omail.com,
	Smita.KoralahalliChannabasappa@....com,
	rrichter@....com,
	dan.carpenter@...aro.org,
	PradeepVineshReddy.Kodamati@....com,
	lukas@...ner.de,
	Benjamin.Cheatham@....com,
	sathyanarayanan.kuppuswamy@...ux.intel.com,
	linux-cxl@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org
Subject: Re: [PATCH v10 00/17] Enable CXL PCIe Port Protocol Error handling and logging

On Thu, 26 Jun 2025 17:42:35 -0500 Terry Bowman <terry.bowman@....com> wrote:

> This patchset updates CXL Protocol Error handling for CXL Ports and CXL
> Endpoints (EP). The reach of this patchset grew from CXL Ports to include
> EPs as well.
> 
> This patchset is a continuation of v9 found here:
> https://lore.kernel.org/linux-cxl/20250603172239.159260-1-terry.bowman@amd.com/
> 
> The first patch is a small cleanup change to reduce amount of code. 
> 
> The next 2 patches introduce pci_dev::is_cxl, aer_info::is_cxl, and add
> bus string to AER log tracing. aer_info::is_cxl will be used to indicate a
> CXL or PCI error and will be used to direct the error handling flow in
> later patches.
> 
> The next patch introduces a new driver file, pci/pcie/cxl_aer.c, to move
> the existing CXL AER logic into.
> 
> The next 3 patches update the AER driver and CXL driver to use a kfifo. 
> The kfifo is added to offload CXL-AER protocol error work to the CXL
> driver. These patches provide the kfifo work add and work remove. 
> 
> The next 5 patches prepare the CXL driver for adding the updated protocol
> error handlers. This includes adding CXL Port RAS mapping and updating
> interfaces for common support.
> 
> The final 5 patches add the CXL error handlers for CXL EPs and CXL Ports.
> CXL EPs keep the PCIe error handler for cases the EP error is interpreted
> as a PCIe error. These patches also add logic to unmask CXL Protocol Errors
> during port probing, and mask CXL Protocol Errors during port device
> cleanup.

Hello Terry,

Thank you for this new version. I just wanted to add that I have been testing
this new version on a few machines, and it fixes an issue that I was seeing
on v8 of the patchset.

Previously, booting a kernel with the parameter pcie_ports=compat would lead
to a kernel crash caused by a NULL pointer dereference. After I rebased the
kernel onto v10, the crash went away and I can use pcie_ports=compat
without any complications. I tried to identify which change led to this fix,
but couldn't find anything specific.

The crash seems like a use-after-free bug and happens specifically in
cxl_dport_init_ras_reporting. Since this new version fixes the issue, please
feel free to add my Tested-by tag in future versions.

Thank you again for your work on this series! I hope you have a great day.
Joshua Hahn

Tested-by: Joshua Hahn <joshua.hahnjy@...il.com>

Sent using hkml (https://github.com/sjp38/hackermail)
