Message-ID: <f70d062f-0889-4ae4-93f7-f2c7578b8bf3@amd.com>
Date: Fri, 3 Oct 2025 15:12:05 -0500
From: "Cheatham, Benjamin" <benjamin.cheatham@....com>
To: Terry Bowman <terry.bowman@....com>
CC: <linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>,
	<dave@...olabs.net>, <jonathan.cameron@...wei.com>, <dave.jiang@...el.com>,
	<alison.schofield@...el.com>, <dan.j.williams@...el.com>,
	<bhelgaas@...gle.com>, <shiju.jose@...wei.com>, <ming.li@...omail.com>,
	<Smita.KoralahalliChannabasappa@....com>, <rrichter@....com>,
	<dan.carpenter@...aro.org>, <PradeepVineshReddy.Kodamati@....com>,
	<lukas@...ner.de>, <sathyanarayanan.kuppuswamy@...ux.intel.com>,
	<linux-cxl@...r.kernel.org>, <alucerop@....com>, <ira.weiny@...el.com>
Subject: Re: [PATCH v12 25/25] CXL/PCI: Disable CXL protocol error interrupts
 during CXL Port cleanup

On 9/25/2025 5:34 PM, Terry Bowman wrote:
> During CXL device cleanup the CXL PCIe Port device interrupts remain
> enabled. This potentially allows unnecessary interrupt processing on
> behalf of the CXL errors while the device is destroyed.
> 
> Disable CXL protocol errors by setting the CXL devices' AER mask register.
> 
> Introduce pci_aer_mask_internal_errors() similar to pci_aer_unmask_internal_errors().
> Add to the AER service driver allowing other subsystems to use.
> 
> Introduce cxl_mask_proto_interrupts() to call pci_aer_mask_internal_errors().
> Add calls to cxl_mask_proto_interrupts() within CXL Port teardown for CXL
> Root Ports, CXL Downstream Switch Ports, CXL Upstream Switch Ports, and CXL
> Endpoints. Follow the same "bottom-up" approach used during CXL Port
> teardown.
> 
> Signed-off-by: Terry Bowman <terry.bowman@....com>
> Reviewed-by: Dave Jiang <dave.jiang@...el.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@...wei.com>
> 
> ---
> 
> Changes in v11->v12:
> - Keep pci_aer_mask_internal_errors() in driver/pci/pcie/aer.c (Lukas)
> - Update commit description for pci_aer_mask_internal_errors()
> - Add check `if (port->parent_dport)` in delete_switch_port() (Terry)
> 
> Changes in v10->v11:
> - Removed guard() cxl_mask_proto_interrupts(). RP was blocking during
>   testing. (Terry)
> ---
>  drivers/cxl/core/core.h |  2 ++
>  drivers/cxl/core/port.c | 10 +++++++++-
>  drivers/cxl/core/ras.c  |  7 +++++++
>  drivers/pci/pcie/aer.c  | 21 +++++++++++++++++++++
>  include/linux/aer.h     |  2 ++
>  5 files changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index 3197a71bf7b8..db318a81034a 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -158,6 +158,7 @@ void cxl_cor_error_detected(struct device *dev);
>  pci_ers_result_t cxl_error_detected(struct device *dev);
>  void cxl_port_cor_error_detected(struct device *dev);
>  pci_ers_result_t cxl_port_error_detected(struct device *dev);
> +void cxl_mask_proto_interrupts(struct device *dev);
>  #else
>  static inline int cxl_ras_init(void)
>  {
> @@ -187,6 +188,7 @@ static inline pci_ers_result_t cxl_port_error_detected(struct device *dev)
>  {
>  	return PCI_ERS_RESULT_NONE;
>  }
> +static inline void cxl_mask_proto_interrupts(struct device *dev) { }
>  #endif // CONFIG_CXL_RAS
>  
>  int cxl_gpf_port_setup(struct cxl_dport *dport);
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index f34a44abb2c9..337a165e8dcd 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1434,6 +1434,10 @@ EXPORT_SYMBOL_NS_GPL(cxl_endpoint_autoremove, "CXL");
>   */
>  static void delete_switch_port(struct cxl_port *port)
>  {
> +	cxl_mask_proto_interrupts(port->uport_dev);
> +	if (port->parent_dport)
> +		cxl_mask_proto_interrupts(port->parent_dport->dport_dev);
> +
>  	devm_release_action(port->dev.parent, cxl_unlink_parent_dport, port);
>  	devm_release_action(port->dev.parent, cxl_unlink_uport, port);
>  	devm_release_action(port->dev.parent, unregister_port, port);
> @@ -1455,8 +1459,10 @@ static void del_dports(struct cxl_port *port)
>  
>  	device_lock_assert(&port->dev);
>  
> -	xa_for_each(&port->dports, index, dport)
> +	xa_for_each(&port->dports, index, dport) {
> +		cxl_mask_proto_interrupts(dport->dport_dev);

Should this call be moved into del_dport()? I think dports can also be
deleted one at a time as downstream devices leave, and that path would
skip masking the protocol interrupts on those dports.

If that's not the case, then:
Reviewed-by: Ben Cheatham <benjamin.cheatham@....com>

>  		del_dport(dport);
> +	}

