Message-ID: <672941925f59d_2ce729465@dwillia2-xfh.jf.intel.com.notmuch>
Date: Mon, 4 Nov 2024 13:50:10 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>, Terry Bowman
<terry.bowman@....com>
CC: <ming4.li@...el.com>, <linux-cxl@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>,
<dave@...olabs.net>, <dave.jiang@...el.com>, <alison.schofield@...el.com>,
<vishal.l.verma@...el.com>, <dan.j.williams@...el.com>,
<bhelgaas@...gle.com>, <mahesh@...ux.ibm.com>, <ira.weiny@...el.com>,
<oohall@...il.com>, <Benjamin.Cheatham@....com>, <rrichter@....com>,
<nathan.fontenot@....com>, <Smita.KoralahalliChannabasappa@....com>
Subject: Re: [PATCH v2 05/14] PCI/AER: Add CXL PCIe port correctable error
support in AER service driver
Jonathan Cameron wrote:
> On Fri, 25 Oct 2024 16:02:56 -0500
> Terry Bowman <terry.bowman@....com> wrote:
[..]
> Anyhow, I think it is fine but I would call out that this changes
> things so that the PCI error handlers are no longer called for CXL ports
> if it's an internal error.
>
> With a sentence on that:
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
>
> I'm not 100% convinced the path of separate handlers is the way to go
> but we can always change things again if that doesn't work out.

Hmm, if that part is not clear there should at least be more
documentation as to the "why". For me it is the fact that CXL
potentially promotes endpoint errors to region-scope recovery actions,
and that PCIe native AER has no concept of an AER error triggering an
unrecoverable, system-fatal response.

To date, panic on AER error has only been logic that ACPI APEI can
deploy, and the kernel has no chance to evaluate the error. So, CXL
error handlers are a reflection of the fact that these errors are
outside of the PCIe AER error model.