Message-ID: <BN8PR12MB29005068F39DE028F19084EDB8A0A@BN8PR12MB2900.namprd12.prod.outlook.com>
Date:   Tue, 31 Oct 2023 12:26:31 +0000
From:   Vidya Sagar <vidyas@...dia.com>
To:     Bjorn Helgaas <helgaas@...nel.org>,
        Lorenzo Pieralisi <lpieralisi@...nel.org>
CC:     Vikram Sethi <vsethi@...dia.com>,
        Thierry Reding <treding@...dia.com>,
        Jonathan Hunter <jonathanh@...dia.com>,
        Krishna Thota <kthota@...dia.com>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Question: Clearing error bits in the root port post enumeration

Hi folks,

I would like your comments on the following scenario, in which the root port logs errors as a side effect of the normal enumeration flow.

DUT information:
- Has a root port with an endpoint connected to it
- Uses the ECAM mechanism to access the configuration space
- Boots through the ACPI flow
- Uses a Firmware-First approach for handling errors
- Is configured to treat Unsupported Requests as Advisory Non-Fatal errors

As per the PCIe spec, a configuration read request that targets a device number that is not implemented is completed with an Unsupported Request (UR) status.

As part of the enumeration flow on the DUT, when the kernel reads offset 0x0 of B:D:F=0:0:0, the root port responds with its valid Vendor ID and Device ID values.
But when B:D:F=0:1:0 is probed, there is no device present there, so the root port responds with an Unsupported Request and at the same time logs it in its Device Status register (bit 3, Unsupported Request Detected).
As a result, a UR is already logged in the Device Status register of the RP by the time enumeration is complete.
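
To make the probe sequence concrete, here is a minimal user-space sketch of how an ECAM offset is formed for a given B:D:F and how an empty slot is typically recognized. The helper names ecam_offset() and vendor_id_absent() are hypothetical and exist only for this illustration. The sketch also assumes the common host-bridge behavior of returning all ones for a read that completes with UR.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * ECAM gives every function a 4 KiB window: bus, device and function
 * select the window, and the register offset indexes into it.
 */
static inline uint64_t ecam_offset(uint8_t bus, uint8_t dev, uint8_t fn,
                                   uint16_t reg)
{
    return ((uint64_t)bus << 20) |
           ((uint64_t)(dev & 0x1f) << 15) |
           ((uint64_t)(fn & 0x07) << 12) |
           (uint64_t)(reg & 0xfff);
}

/*
 * Hypothetical helper: an all-ones Vendor ID in the first dword is
 * treated as "no device at this B:D:F" (assumes the host bridge
 * returns all ones when the read completes with UR).
 */
static inline bool vendor_id_absent(uint32_t dword0)
{
    return (dword0 & 0xffff) == 0xffff;
}
```

With this layout, the probe of B:D:F=0:1:0 at offset 0x0 lands at ECAM offset 0x8000, and it is that read which leaves the UR logged in the root port.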

When the kernel natively owns the AER capability, the AER driver's init call clears all such pending bits.

Since we are using the Firmware-First approach, and the system is configured to treat Unsupported Requests as Advisory Non-Fatal errors, only a correctable error interrupt can be raised to the firmware, which is then responsible for clearing the corresponding status registers.
The firmware cannot know that the UnsupReq bit is set, because the interrupt it received is for a correctable error, so it clears only the bits related to the correctable error.

Together, these events leave a freshly booted system with the following bits set in the root port:

Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-  (MAbort)
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-  (UnsupReq)
UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-  (UnsupReq)
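
For reference, the three flagged bits map to the register masks below. This is a minimal sketch: the macro names, struct rp_status and rp_has_enum_residue() are made up for this illustration, while the bit positions are the ones defined for the Secondary Status, Device Status and AER Uncorrectable Error Status registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions per the PCI/PCIe specs; the names are local to this sketch. */
#define SEC_STATUS_REC_MASTER_ABORT 0x2000u     /* Secondary Status, bit 13 (<MAbort) */
#define DEVSTA_UNSUP_REQ_DETECTED   0x0008u     /* Device Status, bit 3 (UnsupReq) */
#define UESTA_UNSUP_REQ             0x00100000u /* AER UESta, bit 20 (UnsupReq) */

/* Hypothetical snapshot of the three root port status registers. */
struct rp_status {
    uint16_t sec_status;
    uint16_t dev_status;
    uint32_t uncor_status;
};

/* True if any of the bits that enumeration is expected to leave behind is set. */
static inline bool rp_has_enum_residue(const struct rp_status *s)
{
    return (s->sec_status & SEC_STATUS_REC_MASTER_ABORT) ||
           (s->dev_status & DEVSTA_UNSUP_REQ_DETECTED) ||
           (s->uncor_status & UESTA_UNSUP_REQ);
}
```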

Since the reason for the UR is well understood at this point, I would like your views on the idea of clearing the aforementioned bits in the root port once enumeration is done, particularly to cater to configurations where the Firmware-First approach is in place.
Please let me know your comments on this approach.
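
As a rough illustration of the idea, the user-space sketch below models the write-1-to-clear (RW1C) behavior of the three status registers and clears only the bits that enumeration is expected to leave behind. It is a simulation under stated assumptions, not kernel code: struct rp_regs, rw1c_write() and clear_enum_residue() are hypothetical names. In the kernel, the equivalent writes would presumably go through the usual config accessors such as pci_write_config_word() and pcie_capability_write_word(), but that mapping is an assumption and has not been tested.

```c
#include <stdint.h>

/* Bit positions per the PCI/PCIe specs; the names are local to this sketch. */
#define SEC_STATUS_REC_MASTER_ABORT 0x2000u     /* Secondary Status, bit 13 */
#define DEVSTA_UNSUP_REQ_DETECTED   0x0008u     /* Device Status, bit 3 */
#define UESTA_UNSUP_REQ             0x00100000u /* AER UESta, bit 20 */

/* Hypothetical in-memory model of the root port status registers. */
struct rp_regs {
    uint16_t sec_status;
    uint16_t dev_status;
    uint32_t uncor_status;
};

/*
 * Model of an RW1C write: bits written as 1 are cleared,
 * bits written as 0 are left untouched.
 */
static inline uint32_t rw1c_write(uint32_t current, uint32_t written)
{
    return current & ~written;
}

/* Clear only the bits attributed to probing non-existent devices. */
static inline void clear_enum_residue(struct rp_regs *r)
{
    r->sec_status = (uint16_t)rw1c_write(r->sec_status,
                                         SEC_STATUS_REC_MASTER_ABORT);
    r->dev_status = (uint16_t)rw1c_write(r->dev_status,
                                         DEVSTA_UNSUP_REQ_DETECTED);
    r->uncor_status = rw1c_write(r->uncor_status, UESTA_UNSUP_REQ);
}
```

Writing only the specific masks, and not the whole register value read back, keeps any unrelated status bits that happen to be set intact.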

Thanks,
Vidya Sagar
