Message-ID: <20181114055956.GA144931@google.com>
Date: Tue, 13 Nov 2018 23:59:56 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Alex_Gagniuc@...lteam.com
Cc: oohall@...il.com, gregkh@...uxfoundation.org,
keith.busch@...el.com, mr.nuke.me@...il.com,
linux-pci@...r.kernel.org, Austin.Bolen@...l.com,
Shyam.Iyer@...l.com, linux-kernel@...r.kernel.org,
jonathan.derrick@...el.com, lukas@...ner.de, ruscur@...sell.cc,
sbobroff@...ux.ibm.com, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is
disconnected
On Tue, Nov 13, 2018 at 10:39:15PM +0000, Alex_Gagniuc@...lteam.com wrote:
> On 11/12/2018 11:02 PM, Bjorn Helgaas wrote:
> >
> > [EXTERNAL EMAIL]
> > Please report any suspicious attachments, links, or requests for sensitive information.
It looks like Dell's email system adds the above in such a way that the
email quoting convention suggests that *I* wrote it, when I did not.
> ...
> > Do you think Linux observes the rule about not touching AER bits on
> > FFS? I'm not sure it does. I'm not even sure what section of the
> > spec is relevant.
>
> I haven't found any place where Linux breaks this rule. I'm very
> confident that, unless otherwise instructed, we follow this rule.
Just to make sure we're on the same page, can you point me to this
rule? I do see that OSPM must request control of AER using _OSC
before it touches the AER registers. What I don't see is the
connection between firmware-first and the AER registers.
The closest I can find is the "Enabled" field in the HEST PCIe
AER structures (ACPI v6.2, sec 18.3.2.4, .5, .6), where it says:
    If the field value is 1, indicates this error source is
    to be enabled.

    If the field value is 0, indicates that the error source
    is not to be enabled.

    If FIRMWARE_FIRST is set in the flags field, the Enabled
    field is ignored by the OSPM.
AFAICT, Linux completely ignores the Enabled field in these
structures.
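
The Enabled-field rule quoted above can be sketched as a small decision function. This is only an illustration of the spec text, not code from Linux (which, as noted, ignores the field). The struct `hest_aer_entry`, the enum, and `hest_aer_action()` are hypothetical names, and treating reserved Enabled values like 0 is an assumption.

```c
#include <stdint.h>

/* Bit 0 of the HEST Flags field: FIRMWARE_FIRST. */
#define HEST_FLAG_FIRMWARE_FIRST 0x1u

/* Hypothetical, trimmed-down view of a HEST PCIe AER structure. */
struct hest_aer_entry {
	uint8_t flags;   /* Flags field */
	uint8_t enabled; /* Enabled field */
};

enum hest_action {
	HEST_IGNORE_ENABLED, /* FIRMWARE_FIRST set: OSPM ignores Enabled */
	HEST_ENABLE_SOURCE,  /* Enabled == 1 */
	HEST_DO_NOT_ENABLE,  /* Enabled == 0 (assumption: also any reserved value) */
};

/* What the quoted spec text asks OSPM to do with one error source. */
static enum hest_action hest_aer_action(const struct hest_aer_entry *e)
{
	if (e->flags & HEST_FLAG_FIRMWARE_FIRST)
		return HEST_IGNORE_ENABLED;

	return e->enabled == 1 ? HEST_ENABLE_SOURCE : HEST_DO_NOT_ENABLE;
}
```

Note that the sketch only says what to do with the Enabled field. It does not answer whether OSPM may touch the AER registers when FIRMWARE_FIRST is set, which is the open question here.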
These structures also contain values the OS is apparently supposed to
write to Device Control and several AER registers (in struct
acpi_hest_aer_common). Linux ignores these as well.
These seem like fairly serious omissions in Linux.
> > The whole issue of firmware-first, the mechanism by which firmware
> > gets control, the System Error enables in Root Port Root Control
> > registers, etc., is very murky to me. Jon has a sort of similar issue
> > with VMD where he needs to leave System Errors enabled instead of
> > disabling them as we currently do.
>
> Well, OS gets control via _OSC method, and based on that it should
> touch/not touch the AER bits.
I agree so far.
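
As a rough sketch of the point agreed on here: the OS may touch AER registers only if _OSC granted it AER control. The bit value below matches Linux's OSC_PCI_EXPRESS_AER_CONTROL; the helper `os_may_touch_aer()` and its argument are hypothetical, standing in for whatever control field the platform returned from _OSC.

```c
#include <stdbool.h>
#include <stdint.h>

/* _OSC control-field bit for PCIe AER control. */
#define OSC_PCI_EXPRESS_AER_CONTROL 0x00000008u

/*
 * Hypothetical helper: 'granted' is the control field returned by the
 * platform's _OSC method. Returns true only if AER control was granted.
 */
static bool os_may_touch_aer(uint32_t granted)
{
	return (granted & OSC_PCI_EXPRESS_AER_CONTROL) != 0;
}
```

For example, a returned control field of 0x15 (hotplug, PME, and PCIe capability control, but not AER) would leave the AER registers to firmware.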
> The bits that get set/cleared come from _HPX method,
_HPX tells us about some AER registers, Device Control, Link Control,
and some bridge registers. It doesn't say anything about the Root
Control register that Jon is concerned with.
For firmware-first to work, firmware has to get control. How does it
get control? How does OSPM know to either set up that mechanism or
keep its mitts off something firmware set up before handoff? In Jon's
VMD case, I think firmware-first relies on the System Error controlled
by the Root Control register. Linux thinks it owns that, and I don't
know how to learn otherwise.
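
To make the Root Control concern concrete, here is a minimal sketch of the System Error enable bits involved. The register bit names follow Linux's pci_regs.h. The helper `rtctl_after_aer_init()` and its `preserve_serr` argument are hypothetical: no mechanism is identified in this thread that would tell the OS when to set it, which is exactly the gap described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Root Control register bits (PCIe Capability, offset 0x1C). */
#define PCI_EXP_RTCTL_SECEE  0x0001 /* System Error on Correctable Error */
#define PCI_EXP_RTCTL_SENFEE 0x0002 /* System Error on Non-Fatal Error */
#define PCI_EXP_RTCTL_SEFEE  0x0004 /* System Error on Fatal Error */

#define RTCTL_SERR_MASK \
	(PCI_EXP_RTCTL_SECEE | PCI_EXP_RTCTL_SENFEE | PCI_EXP_RTCTL_SEFEE)

/*
 * Hypothetical policy helper: compute the Root Control value the OS
 * would write when it takes over AER. With preserve_serr false, the
 * System Error enables are cleared; with preserve_serr true (for
 * instance, if firmware-first depends on System Error), the register
 * value is left as firmware set it.
 */
static uint16_t rtctl_after_aer_init(uint16_t rtctl, bool preserve_serr)
{
	if (preserve_serr)
		return rtctl;

	return (uint16_t)(rtctl & ~RTCTL_SERR_MASK);
}
```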
> and there's more about FFS described in the ACPI spec. It
> seems that if the platform wants to enable VMD, it should pass the correct
> bits via _HPX. I'm curious to know in what new twisted way FFS doesn't
> work as intended.
Bjorn