Message-ID: <672941925f59d_2ce729465@dwillia2-xfh.jf.intel.com.notmuch>
Date: Mon, 4 Nov 2024 13:50:10 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>, Terry Bowman
	<terry.bowman@....com>
CC: <ming4.li@...el.com>, <linux-cxl@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>,
	<dave@...olabs.net>, <dave.jiang@...el.com>, <alison.schofield@...el.com>,
	<vishal.l.verma@...el.com>, <dan.j.williams@...el.com>,
	<bhelgaas@...gle.com>, <mahesh@...ux.ibm.com>, <ira.weiny@...el.com>,
	<oohall@...il.com>, <Benjamin.Cheatham@....com>, <rrichter@....com>,
	<nathan.fontenot@....com>, <Smita.KoralahalliChannabasappa@....com>
Subject: Re: [PATCH v2 05/14] PCI/AER: Add CXL PCIe port correctable error
 support in AER service driver

Jonathan Cameron wrote:
> On Fri, 25 Oct 2024 16:02:56 -0500
> Terry Bowman <terry.bowman@....com> wrote:
[..]
> Anyhow, I think it is fine but I would call out that this changes
> things so that the PCI error handlers are no longer called for CXL ports
> if it's an internal error.
> 
> With a sentence on that:
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> 
> I'm not 100% convinced the path of separate handlers is the way to go
> but we can always change things again if that doesn't work out.

Hmm, if that part is not clear there should at least be more
documentation as to the "why". For me it is the fact that CXL
potentially promotes endpoint errors to region scope recovery actions,
and that PCIe native AER has no concept of AER triggering an
unrecoverable, system-fatal response.

To date, panic on AER error has only been logic that ACPI APEI can
deploy, and the kernel has no chance to evaluate the error. So, CXL
error handlers are a reflection of the fact that these errors are
outside of the PCIe AER error model.
