lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 26 Feb 2024 10:33:05 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
Cc: linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	Matthew W Carlis <mattc@...estorage.com>,
	Keith Busch <kbusch@...nel.org>, Lukas Wunner <lukas@...ner.de>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	Bjorn Helgaas <bhelgaas@...gle.com>, stable@...r.kernel.org
Subject: Re: [PATCH v2 1/3] PCI/DPC: Request DPC only if also requesting AER

On Mon, Feb 26, 2024 at 07:46:05AM -0800, Kuppuswamy Sathyanarayanan wrote:
> 
> On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
> > On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
> >> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> >>> From: Bjorn Helgaas <bhelgaas@...gle.com>
> >>>
> >>> When booting with "pci=noaer", we don't request control of AER, but we
> >>> previously *did* request control of DPC, as in the dmesg log attached at
> >>> the bugzilla below:
> >>>
> >>>   Command line: ... pci=noaer
> >>>   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
> >>>   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
> >>>
> >>> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
> >>> says:
> >>>
> >>>   If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
> >>>   must also set bit 7 of the Support field (indicating support for Error
> >>>   Disconnect Recover notifications) and bits 3 and 4 of the Control field
> >>>   (requesting control of PCI Express Advanced Error Reporting and the PCI
> >>>   Express Capability Structure).
> >>
> >> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies
> >> Between _OSC Control Bits".
> >>
> >> Because handling of Downstream Port Containment has a dependency on
> >> Advanced Error Reporting, the operating system is required to
> >> request control over Advanced Error Reporting (bit 3 of the Control
> >> field) while requesting control over Downstream Port Containment
> >> Configuration (bit 7 of the Control field). If the operating system
> >> attempts to claim control of Downstream Port Containment
> >> Configuration without also claiming control over Advanced Error
> >> Reporting, firmware is required to refuse control of the feature
> >> being illegally claimed and mask the corresponding bit.  Firmware is
> >> required to maintain ownership of Advanced Error Reporting if it
> >> retains ownership of Downstream Port Containment Configuration.  If
> >> the operating system sets bit 7 of the Control field, it must set
> >> bit 7 of the Support field, indicating support for the Error
> >> Disconnect Recover event.
> >
> > So I guess you're suggesting that there are two defects here?
> >
> >   1) Linux requested DPC control without requesting AER control.
> >
> >   2) Platform granted DPC control when it shouldn't have.
> >
> > I do agree with that, but obviously we can only fix 1) in Linux.
> 
> Sorry, maybe my comment was not clear. I was just suggesting to
> change the spec reference from r3.3, sec 4.5.1, table 4-5 to r3.3,
> sec 4.5.2.4 "Dependencies Between _OSC Control Bits".

The requirement that the OS request AER control whenever it requests
DPC control is mentioned in both sec 4.5.1 and sec 4.5.2.4.  IMO sec
4.5.2.4 should not exist because the per-bit table in sec 4.5.1 is a
better place for implementation guidance.  4.5.2.4 is easy to miss,
mostly redundant, and hard to integrate with the 4.5.1 table.

What advantage do you see for citing 4.5.2.4 instead of 4.5.1?  The
only real difference I see is that it also points out a firmware
problem.  I don't think the extra text is worth it since it doesn't
motivate the Linux change.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ