Message-ID: <Z6UAk_L22eqiWCix@gourry-fedora-PF4VCD3F>
Date: Thu, 6 Feb 2025 13:33:55 -0500
From: Gregory Price <gourry@...rry.net>
To: Terry Bowman <terry.bowman@....com>
Cc: linux-cxl@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-pci@...r.kernel.org, nifan.cxl@...il.com, dave@...olabs.net,
	jonathan.cameron@...wei.com, dave.jiang@...el.com,
	alison.schofield@...el.com, vishal.l.verma@...el.com,
	dan.j.williams@...el.com, bhelgaas@...gle.com, mahesh@...ux.ibm.com,
	ira.weiny@...el.com, oohall@...il.com, Benjamin.Cheatham@....com,
	rrichter@....com, nathan.fontenot@....com,
	Smita.KoralahalliChannabasappa@....com, lukas@...ner.de,
	ming.li@...omail.com, PradeepVineshReddy.Kodamati@....com,
	alucerop@....com
Subject: Re: [PATCH v5 05/16] PCI/AER: Add CXL PCIe Port correctable error
 support in AER service driver

On Tue, Jan 07, 2025 at 08:38:41AM -0600, Terry Bowman wrote:
> The AER service driver supports handling Downstream Port Protocol Errors in
> Restricted CXL host (RCH) mode also known as CXL1.1. It needs the same
> functionality for CXL PCIe Ports operating in Virtual Hierarchy (VH)
> mode.[1]
> 
> CXL and PCIe Protocol Error handling have different requirements that
> necessitate a separate handling path. The AER service driver may try to
> recover PCIe uncorrectable non-fatal errors (UCE). The same recovery is not
> suitable for CXL PCIe Port devices because of potential for system memory
> corruption. Instead, CXL Protocol Error handling must use a kernel panic
> in the case of a fatal or non-fatal UCE. The AER driver's PCIe Protocol
> Error handling does not panic the kernel in response to a UCE.
>

Naive question: is a panic actually required if the memory is a userland
resource?

The code in arch/x86/kernel/cpu/mce/core.c suggests that when an
uncorrectable error hits userland memory in this fashion, we may not
panic but simply deliver a SIGBUS to the affected task.
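
For reference, a minimal sketch of what that SIGBUS path looks like from
the userland side, assuming the Linux siginfo codes BUS_MCEERR_AR and
BUS_MCEERR_AO documented in sigaction(2). The helper names
(describe_sigbus, install_sigbus_handler) are hypothetical and only for
illustration; whether a CXL protocol error could be routed this way is
exactly the open question, so this shows the existing memory-error
signalling only, not anything in the patch set.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t keeps the accesses async-signal-safe. */
static volatile sig_atomic_t got_sigbus = 0;
static volatile sig_atomic_t last_code = 0;

/* Hypothetical helper: map a SIGBUS si_code to a short description. */
const char *describe_sigbus(int si_code)
{
	switch (si_code) {
#ifdef BUS_MCEERR_AR
	case BUS_MCEERR_AR:
		/* Poisoned memory was consumed; the task cannot continue as-is. */
		return "memory error, action required";
#endif
#ifdef BUS_MCEERR_AO
	case BUS_MCEERR_AO:
		/* Poison detected but not yet consumed by this task. */
		return "memory error, action optional";
#endif
	default:
		return "other SIGBUS source";
	}
}

/* Record only; anything heavier belongs outside signal context. */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	got_sigbus = 1;
	last_code = info->si_code;
}

/* Hypothetical helper: install the handler, returns 0 on success. */
int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;	/* needed to receive siginfo_t */
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}
```

A caller would run install_sigbus_handler() early and, once got_sigbus
is set, pass last_code to describe_sigbus() to tell a memory error
apart from an ordinary SIGBUS. There is no main() here; this is a
fragment to be dropped into a test program.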

Unless this is down the wrong pipe - in which case disregard.

I'm still digging through background on this patch set so I may be
barking up the wrong tree.

~Gregory
