lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Tue, 14 Feb 2023 16:08:43 -0800
From:   Ira Weiny <ira.weiny@...el.com>
To:     Bjorn Helgaas <helgaas@...nel.org>, Ira Weiny <ira.weiny@...el.com>
CC:     Alison Schofield <alison.schofield@...el.com>,
        Vishal Verma <vishal.l.verma@...el.com>,
        Ben Widawsky <bwidawsk@...nel.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "Mahesh J Salgaonkar" <mahesh@...ux.ibm.com>,
        Oliver O'Halloran <oohall@...il.com>,
        <linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-pci@...r.kernel.org>, <linuxppc-dev@...ts.ozlabs.org>,
        "Jonathan Cameron" <Jonathan.Cameron@...wei.com>,
        Dave Jiang <dave.jiang@...el.com>, Stefan Roese <sr@...x.de>,
        Kuppuswamy Sathyanarayanan 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>
Subject: Re: [PATCH RFC] PCI/AER: Enable internal AER errors by default

Bjorn Helgaas wrote:
> On Fri, Feb 10, 2023 at 02:33:23PM -0800, Ira Weiny wrote:
> > The CXL driver expects internal error reporting to be enabled via
> > pci_enable_pcie_error_reporting().  It is likely other drivers expect the same.
> > Dave submitted a patch to enable the CXL side[1] but the PCI AER registers
> > still mask errors.
> > 
> > PCIe v6.0 Uncorrectable Mask Register (7.8.4.3) and Correctable Mask
> > Register (7.8.4.6) default to masking internal errors.  The
> > Uncorrectable Error Severity Register (7.8.4.4) defaults internal errors
> > as fatal.
> > 
> > Enable internal errors to be reported via the standard
> > pci_enable_pcie_error_reporting() call.  Ensure uncorrectable errors are set
> > non-fatal to limit any impact to other drivers.
> 
> Do you have any background on why the spec makes these errors masked
> by default?  I'm sympathetic to wanting to learn about all the errors
> we can, but I'm a little wary if the spec authors thought it was
> important to mask these by default.
> 

I don't have any idea of the history.

To me 'internal errors' is a pretty wide net and was likely a catch all
that the authors felt was mostly unneeded.

CXL is different because it further divides the errors.

I've enlisted some help internal to Intel to hopefully find some answers.
But in the event no one knows it would be safe to to with my alternate
suggestion and add a new PCIe call to enable this specifically for the
drivers who need it.

Ira

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ