Message-ID: <1968415b-d217-40b4-a8cb-5465f958016e@amd.com>
Date: Mon, 4 Nov 2024 16:05:51 -0600
From: "Bowman, Terry" <terry.bowman@....com>
To: Dan Williams <dan.j.williams@...el.com>,
 Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: ming4.li@...el.com, linux-cxl@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org, dave@...olabs.net,
 dave.jiang@...el.com, alison.schofield@...el.com, vishal.l.verma@...el.com,
 bhelgaas@...gle.com, mahesh@...ux.ibm.com, ira.weiny@...el.com,
 oohall@...il.com, Benjamin.Cheatham@....com, rrichter@....com,
 nathan.fontenot@....com, Smita.KoralahalliChannabasappa@....com
Subject: Re: [PATCH v2 05/14] PCI/AER: Add CXL PCIe port correctable error
 support in AER service driver



On 11/4/2024 3:50 PM, Dan Williams wrote:
> Jonathan Cameron wrote:
>> On Fri, 25 Oct 2024 16:02:56 -0500
>> Terry Bowman <terry.bowman@....com> wrote:
> [..]
>> Anyhow, I think it is fine but I would call out that this changes
>> things so that the PCI error handlers are no longer called for CXL ports
>> if it's an internal error.
>>
>> With a sentence on that:
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
>>
>> I'm not 100% convinced the path of separate handlers is the way to go
>> but we can always change things again if that doesn't work out.
> Hmm, if that part is not clear there should at least be more
> documentation as to the "why". For me it is the fact that CXL
> potentially promotes endpoint errors to region scope recovery actions,
> and that PCIe native AER has no concept of AER triggering an unrecoverable
> system fatal response.
>
> To date panic on AER error has only been logic that ACPI APEI can
> deploy, and the kernel has no chance to evaluate the error. So, the CXL
> error handlers are a reflection that these errors are outside of the PCIe
> AER error model.

Hi Dan,

I'll elaborate more and touch on what you mentioned.

Regards,
Terry
