Message-ID: <81376530-ecc7-4457-bfed-2e8b65f69f4e@amd.com>
Date: Thu, 2 Nov 2023 18:24:33 -0500
From: Terry Bowman <Terry.Bowman@....com>
To: Dan Williams <dan.j.williams@...el.com>,
Alison Schofield <alison.schofield@...el.com>
Cc: vishal.l.verma@...el.com, ira.weiny@...el.com, bwidawsk@...nel.org,
dave.jiang@...el.com, Jonathan.Cameron@...wei.com,
linux-cxl@...r.kernel.org, Smita.KoralahalliChannabasappa@....com,
rrichter@....com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cxl/pci: Change CXL AER support check to use native AER
Hi Dan and Alison,
On 11/2/23 16:31, Dan Williams wrote:
> Dan Williams wrote:
>> Alison Schofield wrote:
>>> On Thu, Nov 02, 2023 at 10:52:32AM -0500, Terry Bowman wrote:
>>>> Native CXL protocol errors are delivered to the OS through AER
>>>> reporting. The owner of AER owns CXL Protocol error management with
>>>> respect to _OSC negotiation.[1] CXL device errors are handled by a
>>>> separate interrupt with native control gated by _OSC control field
>>>> 'CXL Memory Error Reporting Control'.
>>>>
>>>> The CXL driver incorrectly checks for 'CXL Memory Error Reporting
>>>> Control' before accessing AER registers and caching RCH downport
>>>> AER registers. Replace the current check in these 2 cases with
>>>> native AER checks.
>>>
>>> Hi Terry, Does this have a user visible impact?
>>
>> Saw this after I applied it. It is good feedback in general.
>>
>> The reason I did not ask for this clarification was that this is fixing
>> brand new code and was just using the wrong flag, so I had the context.
>> A backporter will never need to make a judgement call about this patch.
>>
>> The end user impact is that CXL protocol errors that could be handled by
>> AER will not be handled if Linux failed to negotiate memory error
>> handling. Memory errors are strictly related to memory-error-record
>> events, not protocol errors.
>
Right, the end-user impact is that RCH error handling requires native
memory error/event _OSC control in order for protocol errors to be logged.
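
As a rough illustration of the gating being discussed, here is a minimal
user-space sketch. It is not the driver code: struct osc_grants and both
helper functions are hypothetical names made up for this example, and the
two flags only model which _OSC controls the OS was granted.

```c
#include <stdbool.h>

/* Hypothetical model of the _OSC controls granted to the OS for one
 * host bridge. Illustrative only, not the kernel's data structure. */
struct osc_grants {
	bool native_aer;       /* OS owns PCIe AER */
	bool native_cxl_error; /* OS owns CXL Memory Error Reporting Control */
};

/* Check as described for the code being fixed: protocol error handling
 * was gated on memory error reporting control. */
static bool protocol_errors_handled_old(const struct osc_grants *g)
{
	return g->native_cxl_error;
}

/* Check as described for the fix: protocol error handling is gated on
 * AER ownership, since protocol errors are delivered through AER. */
static bool protocol_errors_handled_new(const struct osc_grants *g)
{
	return g->native_aer;
}
```

With native_aer granted but native_cxl_error not granted, the old check
reports that protocol errors are not handled while the new check reports
that they are, which is the case described above.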
> However, to that point the "Fixes:" tag looks wrong, it should be:
>
> f05fd10d138d cxl/pci: Add RCH downstream port AER register discovery
Correct, it is f05fd10d138d.
Regards,
Terry