linux-kernel - Re: [RFC PATCH 1/6] PCI/RCEC: Introduce pcie_walk_rcec

Message-ID: <9f23dd31-e7ef-4599-a13c-932ac288266b@intel.com>
Date: Tue, 23 Apr 2024 10:33:23 +0800
From: "Li, Ming" <ming4.li@...el.com>
To: Dan Williams <dan.j.williams@...el.com>, Terry Bowman
	<Terry.Bowman@....com>, <rrichter@....com>
CC: <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 1/6] PCI/RCEC: Introduce pcie_walk_rcec_all()

On 4/23/2024 7:03 AM, Dan Williams wrote:
> Terry Bowman wrote:
> [..]
>>> Hi Terry,
>>>
>>> This patchset is responding to the implications of the implementation
>>> note in 9.18.1.5 RCEC Downstream Port Association Structure (RDPAS).
>>> That says that CXL.io and CXL.cachemem errors in Root Ports may indeed
>>> be signaled to an RCEC. Do you expect that implementation note to cause
>>> any issues on platforms that do not follow that CXL spec behavior?
>>>
>>> My expectation is that it may just cause extra polling for errors, but
>>> not cause any harm.
>>
>> AMD platforms in RCH/RCD mode consume protocol errors in the RCEC's AER driver. AMD 
>> platforms in VH mode consume protocol errors (including root port errors) in the 
>> root port's AER driver. The exception is the VH mode host with CXL1.1 endpoint and 
>> RCH downstream errors. CXL1.1 endpoint and RCH downstream errors in a VH host are 
>> consumed in the RCEC.
> 
> I agree that's the most compatible path for existing software.
> 
>> I don't believe these patchset changes would affect this behavior. But, I will need 
>> to test to confirm.
> 
> As I wrote to Li Ming, I think any potential conflict can further be
> limited by the fact that this extra scanning is limited to CXL.cachemem,
> not typical PCI AER flows.

Agree with Dan, but I think software has no way to know whether an error is a CXL.cachemem error without RDPAS (it only knows that the RCEC reported an uncor_internal_error/cor_internal_error). Maybe we can limit this extra scanning to the RPs operating in CXL mode?
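
To make the suggestion concrete, here is a rough user-space sketch of the filtering idea, not kernel code. The struct rp_model type, its fields, and rcec_scan_cxl_rps() are all hypothetical names invented for illustration; how the kernel would actually determine that a root port is in CXL mode is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a root port; not a kernel structure. */
struct rp_model {
	bool cxl_mode;       /* assumption: the port's link is operating in CXL mode */
	bool internal_error; /* assumption: an internal error is pending on this port */
	int scanned;         /* number of times this port has been scanned */
};

/*
 * Called when the RCEC reports an internal error and there is no RDPAS
 * to tell us which ports are associated with it. Instead of scanning
 * every root port, visit only the ones operating in CXL mode.
 * Returns the number of ports scanned.
 */
size_t rcec_scan_cxl_rps(struct rp_model *ports, size_t n)
{
	size_t scanned = 0;

	assert(ports != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!ports[i].cxl_mode)
			continue; /* plain PCIe port: keep it out of the extra scan */

		ports[i].scanned++;
		scanned++;
	}

	return scanned;
}
```

With this filter, the cost of the extra scan grows with the number of CXL-mode root ports rather than with all root ports, and ports on non-CXL links are never touched by it.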