Message-ID: <bb575cfdcd0a4c50b489fb16eccfbff2@ausx13mps321.AMER.DELL.COM>
Date: Mon, 19 Nov 2018 20:16:59 +0000
From: <Alex_Gagniuc@...lteam.com>
To: <okaya@...nel.org>, <mr.nuke.me@...il.com>, <keith.busch@...el.com>
Cc: <baicar.tyler@...il.com>, <Austin.Bolen@...l.com>,
<Shyam.Iyer@...l.com>, <lukas@...ner.de>, <bhelgaas@...gle.com>,
<rjw@...ysocki.net>, <lenb@...nel.org>, <ruscur@...sell.cc>,
<sbobroff@...ux.ibm.com>, <oohall@...il.com>,
<linux-pci@...r.kernel.org>, <linux-acpi@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [PATCH 0/2] PCI/AER: Consistently use _OSC to determine who owns
AER
On 11/19/2018 01:32 PM, Sinan Kaya wrote:
> ACPI 6.2:
>
> 18.3.2.4 PCI Express Root Port AER Structure
>
> Flags:
>
> Bit [0] - FIRMWARE_FIRST: If set, this bit indicates to the OSPM that system
> firmware will handle errors from this source first.
> Bit [1] - GLOBAL: If set, indicates that the settings contained in this
> structure apply globally to all PCI Express Devices.
> All other bits must be set to zero.
>
> It doesn't say shall, may or might. It says will.
It says "system firmware will handle errors". It does not say "system
firmware owns AER registers". In the absence of any descriptive text on the
meaning of these tables, this really looks to me like it should be
interpreted as a descriptor of APEI error sources, not a mutex on who
writes to certain bits (AER in this case).
I don't think that is contradictory or inconsistent.
I also wasn't able to find any reference to HEST in UEFI 2.7, only in
the ACPI spec.
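For reference, the two flag bits quoted above can be decoded as in the
minimal sketch below. The macro and function names are hypothetical, made
up for illustration; they are not taken from the kernel or from ACPICA.
The bit positions are the ones in the quoted ACPI 6.2 text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the flag bits in ACPI 6.2, 18.3.2.4. */
#define HEST_FLAG_FIRMWARE_FIRST (1u << 0)
#define HEST_FLAG_GLOBAL         (1u << 1)

/* Bit [0]: firmware will handle errors from this source first. */
static bool hest_firmware_first(uint8_t flags)
{
	return (flags & HEST_FLAG_FIRMWARE_FIRST) != 0;
}

/* Bit [1]: the settings apply globally to all PCI Express devices. */
static bool hest_global(uint8_t flags)
{
	return (flags & HEST_FLAG_GLOBAL) != 0;
}

/* "All other bits must be set to zero." */
static bool hest_flags_valid(uint8_t flags)
{
	return (flags & ~(HEST_FLAG_FIRMWARE_FIRST | HEST_FLAG_GLOBAL)) == 0;
}
```

Nothing in these two bits encodes who may write to the AER registers; they
only describe how errors from the source are handled.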
> I think It depends on your PCI topology.
>
> For other topologies with multiple PCI root complexes, I can see this being
> used per root complex flag to indicate which root complex needs firmware first
> and which one doesn't.
_OSC is per root bus, so it's already granular enough, right? Why would
it depend on PCI topology?
>> I'd like see how exactly we break one of those elusive systems with _OSC. I
>> suspect _OSC and HEST end up having the same information, and that's why we
>> didn't see any real-life issue with mixing the approaches.
>
> I'm already aware of two systems that rely on HEST table to pass information to
> the OS that firmware first is enabled. Both of the systems do not change their
> _OSC bits during this assuming HEST table has priority over _OSC for firmware
> first.
Are those hax86 systems?
It seems like the systems have broken firmware. I see several ways to
handle broken systems like those:
- Parse both HEST and _OSC, and decide AER ownership with root bridge
granularity, i.e. host_bridge->native_aer is authoritative, but is
derived from both HEST and _OSC
- Add quirks for the broken systems
- Keep doing what we're doing until current code breaks a new system
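The first option could look roughly like the sketch below. This is only an
illustration of one possible policy, not existing kernel code: the struct,
its fields, and the function name are all hypothetical, and it assumes that
a FIRMWARE_FIRST HEST entry should veto an _OSC grant for that root bridge.

```c
#include <stdbool.h>

/* Hypothetical per-root-bridge inputs; not the kernel's real structures. */
struct bridge_aer_info {
	bool osc_granted_aer;     /* _OSC granted AER control to the OS */
	bool hest_firmware_first; /* a HEST entry covering this bridge
				   * has FIRMWARE_FIRST set */
};

/*
 * Assumed policy: the OS owns AER only when _OSC granted it and HEST
 * does not mark the bridge as firmware-first. The result would be
 * stored once, e.g. in host_bridge->native_aer, and treated as
 * authoritative from then on.
 */
static bool bridge_native_aer(const struct bridge_aer_info *info)
{
	return info->osc_granted_aer && !info->hest_firmware_first;
}
```

With this policy, the systems that set FIRMWARE_FIRST in HEST but still
grant AER via _OSC would end up with native_aer false, at the cost of
letting HEST override what _OSC says.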
> If we add this patch, OS will try to claim the AER address space while firmware
> wants exclusive access.
Yay! FFS wants exclusive access, but does not claim it. Oh, FFS!
> As I said in my previous email, the right place to talk about this is UEFI
> forum.
The way I would present the problem to the spec writers is that, although
the spec appears to be consistent, we've seen firmware vendors that made
the wrong assumptions about HEST/_OSC. Instead of describing AER
ownership with _OSC, they attempted to do it with HEST. So we should add
an implementation note, or clarification about this.
Alex