Message-ID: <20251104190353.GA1865360@bhelgaas>
Date: Tue, 4 Nov 2025 13:03:53 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Terry Bowman <terry.bowman@....com>
Cc: dave@...olabs.net, jonathan.cameron@...wei.com, dave.jiang@...el.com,
	alison.schofield@...el.com, dan.j.williams@...el.com,
	bhelgaas@...gle.com, shiju.jose@...wei.com, ming.li@...omail.com,
	Smita.KoralahalliChannabasappa@....com, rrichter@....com,
	dan.carpenter@...aro.org, PradeepVineshReddy.Kodamati@....com,
	lukas@...ner.de, Benjamin.Cheatham@....com,
	sathyanarayanan.kuppuswamy@...ux.intel.com,
	linux-cxl@...r.kernel.org, alucerop@....com, ira.weiny@...el.com,
	linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [RESEND v13 15/25] CXL/PCI: Introduce PCI_ERS_RESULT_PANIC

On Tue, Nov 04, 2025 at 11:02:55AM -0600, Terry Bowman wrote:
> The CXL driver's error handling for uncorrectable errors (UCE) will be
> updated in the future. A required change is for the error handlers to
> force a system panic when a UCE is detected.
> 
> Introduce PCI_ERS_RESULT_PANIC as a new 'enum pci_ers_result' value. This will
> be used by CXL UCE fatal and non-fatal recovery in future patches. Update
> PCIe recovery documentation with details of PCI_ERS_RESULT_PANIC.
> 
> Signed-off-by: Terry Bowman <terry.bowman@....com>
> Reviewed-by: Dave Jiang <dave.jiang@...el.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@...wei.com>
> Reviewed-by: Ben Cheatham <benjamin.cheatham@....com>

This patch doesn't actually *do* anything.  There's no possibility of a
bisect landing on it.  I think it would be better to combine this with
something that *uses* PCI_ERS_RESULT_PANIC, maybe the merge_result()
update?
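
To illustrate what combining the two might look like, here is a minimal
userspace sketch of a merge step that treats PCI_ERS_RESULT_PANIC as
sticky. It is not the code from this series: merge_result_sketch() is a
hypothetical name, the enum is a local stand-in, values 1-5 are assumed to
match include/linux/pci.h, and the PANIC-wins rule is an assumption about
how the later patch could behave.

```c
/* Local stand-in for the kernel enum; 6 and 7 are the values quoted in this patch. */
enum pci_ers_result {
	PCI_ERS_RESULT_NONE = 1,
	PCI_ERS_RESULT_CAN_RECOVER = 2,
	PCI_ERS_RESULT_NEED_RESET = 3,
	PCI_ERS_RESULT_DISCONNECT = 4,
	PCI_ERS_RESULT_RECOVERED = 5,
	PCI_ERS_RESULT_NO_AER_DRIVER = 6,
	PCI_ERS_RESULT_PANIC = 7,
};

/*
 * Hypothetical merge of a running result with one driver's vote.
 * Assumption: once any vote is PANIC, the merged result stays PANIC.
 */
static enum pci_ers_result merge_result_sketch(enum pci_ers_result orig,
					       enum pci_ers_result new)
{
	if (orig == PCI_ERS_RESULT_PANIC || new == PCI_ERS_RESULT_PANIC)
		return PCI_ERS_RESULT_PANIC;

	if (new == PCI_ERS_RESULT_NO_AER_DRIVER)
		return PCI_ERS_RESULT_NO_AER_DRIVER;

	/* A vote of NONE leaves the running result unchanged. */
	if (new == PCI_ERS_RESULT_NONE)
		return orig;

	switch (orig) {
	case PCI_ERS_RESULT_CAN_RECOVER:
	case PCI_ERS_RESULT_RECOVERED:
		/* A more severe vote replaces an optimistic running result. */
		return new;
	case PCI_ERS_RESULT_DISCONNECT:
		if (new == PCI_ERS_RESULT_NEED_RESET)
			return PCI_ERS_RESULT_NEED_RESET;
		return orig;
	default:
		return orig;
	}
}
```

With the enum value and its first user in one patch, a bisect that lands
there would have an actual behavior change to point at.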

I'd suggest a subject prefix of "PCI/ERR", since this isn't really
CXL-specific; it just so happens that you don't know of any uses
outside CXL.

> +++ b/Documentation/PCI/pci-error-recovery.rst
> @@ -102,6 +102,8 @@ Possible return values are::
>  		PCI_ERS_RESULT_NEED_RESET,  /* Device driver wants slot to be reset. */
>  		PCI_ERS_RESULT_DISCONNECT,  /* Device has completely failed, is unrecoverable */
>  		PCI_ERS_RESULT_RECOVERED,   /* Device driver is fully recovered and operational */
> +		PCI_ERS_RESULT_NO_AER_DRIVER, /* No AER capabilities registered for the driver */

"AER capabilities" is confusingly similar to the PCIe AER Capability.

I think this really means "there's no
pci_error_handlers.error_detected() callback".
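
A rough userspace sketch of that reading, to make the condition concrete.
The struct layouts are simplified stand-ins rather than the kernel
definitions, vote_sketch() and the demo_* names are hypothetical, 'int
state' replaces the kernel's channel-state type, and the real recovery
path has further conditions that are omitted here.

```c
#include <stddef.h>

enum pci_ers_result {
	PCI_ERS_RESULT_CAN_RECOVER = 2,
	PCI_ERS_RESULT_NO_AER_DRIVER = 6,
};

struct pci_dev;	/* opaque in this sketch */

/* Simplified stand-ins: only the one callback discussed above. */
struct pci_error_handlers {
	enum pci_ers_result (*error_detected)(struct pci_dev *dev, int state);
};

struct pci_driver {
	const struct pci_error_handlers *err_handler;
};

/*
 * Hypothetical helper: a device whose driver provides no
 * error_detected() callback votes NO_AER_DRIVER, regardless of
 * whether the device has the PCIe AER Capability.
 */
static enum pci_ers_result vote_sketch(const struct pci_driver *drv,
				       struct pci_dev *dev, int state)
{
	if (!drv || !drv->err_handler || !drv->err_handler->error_detected)
		return PCI_ERS_RESULT_NO_AER_DRIVER;

	return drv->err_handler->error_detected(dev, state);
}

/* Demo driver that does implement the callback. */
static enum pci_ers_result demo_error_detected(struct pci_dev *dev, int state)
{
	(void)dev;
	(void)state;
	return PCI_ERS_RESULT_CAN_RECOVER;
}

static const struct pci_error_handlers demo_handlers = {
	.error_detected = demo_error_detected,
};

static const struct pci_driver demo_driver = { .err_handler = &demo_handlers };
static const struct pci_driver demo_driver_no_handlers = { .err_handler = NULL };
```

Here vote_sketch(&demo_driver_no_handlers, NULL, 0) yields
PCI_ERS_RESULT_NO_AER_DRIVER, which is why a comment phrased in terms of
the missing callback would be less ambiguous.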

> +		PCI_ERS_RESULT_PANIC,       /* System is unstable, panic. Is CXL specific */
>  	};
>  
>  A driver does not have to implement all of these callbacks; however,
> @@ -116,6 +118,10 @@ The actual steps taken by a platform to recover from a PCI error
>  event will be platform-dependent, but will follow the general
>  sequence described below.
>  
> +PCI_ERS_RESULT_PANIC is currently unique to CXL and handled in CXL
> +cxl_do_recovery(). The PCI pcie_do_recovery() routine does not report or
> +handle PCI_ERS_RESULT_PANIC.

I'm not sure all these mentions of being CXL specific are really
helpful.  I don't think they are actionable to driver writers.

>  STEP 0: Error Event
>  -------------------
>  A PCI bus error is detected by the PCI hardware.  On powerpc, the slot
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 5c4759078d2f..cffa5535f28d 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -890,6 +890,9 @@ enum pci_ers_result {
>  
>  	/* No AER capabilities registered for the driver */
>  	PCI_ERS_RESULT_NO_AER_DRIVER = (__force pci_ers_result_t) 6,
> +
> +	/* System is unstable, panic. Is CXL specific */
> +	PCI_ERS_RESULT_PANIC = (__force pci_ers_result_t) 7,
>  };
>  
>  /* PCI bus error event callbacks */
> -- 
> 2.34.1
> 
