lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 1 May 2020 10:02:04 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Austin.Bolen@...l.com
Cc:     sathyanarayanan.kuppuswamy@...ux.intel.com, bhelgaas@...gle.com,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        ashok.raj@...el.com, Mario.Limonciello@...l.com,
        jonathan.derrick@...el.com, mr.nuke.me@...il.com,
        rjw@...ysocki.net, kbusch@...nel.org, okaya@...nel.org,
        tbaicar@...eaurora.org
Subject: Re: [PATCH v1 1/1] PCI/AER: Use _OSC negotiation to determine AER
 ownership

On Fri, May 01, 2020 at 02:40:23PM +0000, Austin.Bolen@...l.com wrote:
> On 4/30/2020 6:02 PM, Bjorn Helgaas wrote:
> > [Austin, help us understand the FIRMWARE_FIRST bit! :)]
> >
> > On Thu, Apr 30, 2020 at 05:40:22PM -0500, Bjorn Helgaas wrote:
> >> On Sun, Apr 26, 2020 at 11:30:06AM -0700, sathyanarayanan.kuppuswamy@...ux.intel.com wrote:
> >>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>
> >>>
> >>> Currently PCIe AER driver uses HEST FIRMWARE_FIRST bit to
> >>> determine the PCIe AER Capability ownership between OS and
> >>> firmware. This support is added based on following spec
> >>> reference.
> >>>
> >>> Per ACPI spec r6.3, table 18-387, 18-388, 18-389, HEST table
> >>> flags field BIT-0 and BIT-1 can be used to expose the
> >>> ownership of error source between firmware and OSPM.
> >>>
> >>> Bit [0] - FIRMWARE_FIRST: If set, indicates that system
> >>>           firmware will handle errors from this source
> >>>           first.
> >>> Bit [1] – GLOBAL: If set, indicates that the settings
> >>>           contained in this structure apply globally to all
> >>>           PCI Express Bridges.
> >>>
> >>> Although above spec reference states that setting
> >>> FIRMWARE_FIRST bit means firmware will handle the error source
> >>> first, it does not explicitly state anything about AER
> >>> ownership. So using HEST to determine AER ownership is
> >>> incorrect.
> >>>
> >>> Also, as per following specification references, _OSC can be
> >>> used to negotiate the AER ownership between firmware and OS.
> >>> Details are,
> >>>
> >>> Per ACPI spec r6.3, sec 6.2.11.3, table titled “Interpretation
> >>> of _OSC Control Field” and as per PCI firmware specification r3.2,
> >>> sec 4.5.1, table 4-5, OS can set bit 3 of _OSC control field
> >>> to request control over PCI Express AER. If the OS successfully
> >>> receives control of this feature, it must handle error reporting
> >>> through the AER Capability as described in the PCI Express Base
> >>> Specification.
> >>>
> >>> Since above spec references clearly states _OSC can be used to
> >>> determine AER ownership, don't depend on HEST FIRMWARE_FIRST bit.
> >> I pulled out the _OSC part of this to a separate patch.  What's left
> >> is below, and is essentially equivalent to Alex's patch:
> >>
> >>   https://lore.kernel.org/r/20190326172343.28946-3-mr.nuke.me@gmail.com/
> >>
> >> I like what this does, but what I don't like is the fact that we now
> >> have this thing called pcie_aer_get_firmware_first() that is not
> >> connected with the ACPI FIRMWARE_FIRST bit at all.
> >
> > Austin, if we remove this, we'll have no PCIe-related code that looks
> > at the HEST FIRMWARE_FIRST bit at all.  Presumably it's there for some
> > reason, but I'm not sure what the reason is.
> >
> > Alex's mail [1] has a nice table of _OSC AER/HEST FFS bits that looks
> > useful, but the only actionable thing I can see is that in the (1,1)
> > case, OSPM should do some initialization with masks/enables.
> >
> > But I have no clue what that means or how to connect that with the
> > spec.  What are the masks/enables?  Is that something connected with
> > ERST?
> >
> > [1] https://lore.kernel.org/r/20190326172343.28946-1-mr.nuke.me@gmail.com/
> 
> The only values that make sense to me are (1, 0) for full native OS
> init/handling of AER and (0, 1) for the firmware first model where
> firmware does the init and handles errors first then passes control to
> the OS. For these cases the FIRMWARE_FIRST flag in HEST is redundant and
> not needed.
> 
> We did discuss the (1, 1) case in the ACPI working group and there were
> a potential use case (which Alex documented in the link you provided)
> but there is nothing specified in the standard about how that model
> would actually work AFAICT. And no x86 system has the hardware support
> needed for what was proposed that I'm aware of (not sure about other
> architectures).
> 
> So unless and until someone documents how the firmware and OS are
> supposed to behave in the (1, 1) or (0, 0) scenario and expresses a need
> for those models, I would not bother adding support for them.  Just my 2
> cents.

Perfect, I think we should ignore the FIRMWARE_FIRST bit in the HEST
PCIe entries for now.  Thanks a lot for your help with this!

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ