Message-ID: <68815a66459e4_134cc710012@dwillia2-xfh.jf.intel.com.notmuch>
Date: Wed, 23 Jul 2025 14:55:50 -0700
From: <dan.j.williams@...el.com>
To: Terry Bowman <terry.bowman@....com>, <dave@...olabs.net>,
	<jonathan.cameron@...wei.com>, <dave.jiang@...el.com>,
	<alison.schofield@...el.com>, <dan.j.williams@...el.com>,
	<bhelgaas@...gle.com>, <shiju.jose@...wei.com>, <ming.li@...omail.com>,
	<Smita.KoralahalliChannabasappa@....com>, <rrichter@....com>,
	<dan.carpenter@...aro.org>, <PradeepVineshReddy.Kodamati@....com>,
	<lukas@...ner.de>, <Benjamin.Cheatham@....com>,
	<sathyanarayanan.kuppuswamy@...ux.intel.com>, <terry.bowman@....com>,
	<linux-cxl@...r.kernel.org>
CC: <linux-kernel@...r.kernel.org>, <linux-pci@...r.kernel.org>
Subject: Re: [PATCH v10 00/17] Enable CXL PCIe Port Protocol Error handling
 and logging

Terry Bowman wrote:
> This patchset updates CXL Protocol Error handling for CXL Ports and CXL
> Endpoints (EP). The reach of this patchset grew from CXL Ports to include
> EPs as well.
[..]
> == Testing ==
> Testing results below shows the Upstream Switch Port UCE and EP UCE errors
> are handled as PCI errors. This is because aer_get_device_error_info() does
> not populate the AER error severity and status in the case of FATAL UCE on
> Upstream Ports and Endpoints. This is intended because the USP link to
> access the device can be compromised. The check for is_cxl_error() and
> is_internal_error() fail as a result and then processes the error as a PCI
> error. Also, the AER event logging is missing the PCIe AER status.

Are those issues "TODO" or permanent quirks of the implementation?

Although, looking at the error messages, they all seem to correctly say "CXL
Bus Error", so I guess I am not seeing the end-user-visible problem in the
details you are pointing out here. I.e. LGTM.
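
For anyone following along, here is a rough model of the fall-through
described in the quoted testing notes. It is only a sketch: the two bit
values match PCI_ERR_COR_INTERNAL and PCI_ERR_UNC_INTN in the kernel's
pci_regs.h, but classify(), its is_cxl_device argument, and the string
severity constants are hypothetical stand-ins, not the code in this series.

```python
# Simplified model, not the kernel implementation.
PCI_ERR_COR_INTERNAL = 0x00004000  # bit 14, shown as "CorrIntErr" in the log
PCI_ERR_UNC_INTN = 0x00400000      # bit 22, uncorrectable internal error

# Assumption: plain strings stand in for the kernel's severity constants.
AER_CORRECTABLE = "correctable"
AER_FATAL = "fatal"


def is_internal_error(severity, status):
    """Return True if the AER status has the internal-error bit for this severity."""
    if severity == AER_CORRECTABLE:
        return bool(status & PCI_ERR_COR_INTERNAL)
    return bool(status & PCI_ERR_UNC_INTN)


def classify(is_cxl_device, severity, status):
    """Hypothetical dispatch: take the CXL path only when both checks pass."""
    if is_cxl_device and is_internal_error(severity, status):
        return "cxl"
    return "pci"
```

With the injected root port status from the log (00004000, correctable),
classify(True, AER_CORRECTABLE, 0x00004000) takes the CXL path. With a
status left unpopulated (0) for a fatal error, the internal-error check
fails and the same call falls back to the PCI path, which is the behavior
the testing notes describe.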

[..]
> == Root Port ==
> root@...wman-cxl:~/aer-inject# ./root-ce-inject.sh

Where can I find these inject scripts?

> pcieport 0000:0c:00.0: aer_inject: Injecting errors 00004000/00000000 into device 0000:0c:00.0
> pcieport 0000:0c:00.0: AER: Correctable error message received from 0000:0c:00.0
> aer_event: 0000:0c:00.0 CXL Bus Error: severity=Corrected, Corrected Internal Error, TLP Header=Not available
> pcieport 0000:0c:00.0: CXL Bus Error: severity=Correctable, type=Transaction Layer, (Receiver ID)
> pcieport 0000:0c:00.0:   device [8086:7075] error status/mask=00004000/0000a000
> pcieport 0000:0c:00.0:    [14] CorrIntErr    
> cxl_aer_correctable_error: memdev=0000:0c:00.0 host=pci0000:0c serial=0 status='CRC Threshold Hit'

Hmm, why "memdev=" for a root port error? Will take a look at what
cxl_aer_correctable_error() is doing.

[..] 
> base-commit: 716ba3023561ccacfaa28f988d26717535b8fed1

I cannot find this commit in mainline or linux-next. Please do try to
base the series on a mainline tag, or otherwise push a public baseline
branch somewhere. That helps reviewers and build bots.
