lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1eb0fa27924f426992715684f5e63346@ausx13mps321.AMER.DELL.COM>
Date:   Wed, 14 Nov 2018 19:22:04 +0000
From:   <Alex_Gagniuc@...lteam.com>
To:     <helgaas@...nel.org>
Cc:     <oohall@...il.com>, <gregkh@...uxfoundation.org>,
        <keith.busch@...el.com>, <mr.nuke.me@...il.com>,
        <linux-pci@...r.kernel.org>, <Austin.Bolen@...l.com>,
        <Shyam.Iyer@...l.com>, <linux-kernel@...r.kernel.org>,
        <jonathan.derrick@...el.com>, <lukas@...ner.de>,
        <ruscur@...sell.cc>, <sbobroff@...ux.ibm.com>,
        <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [PATCH v2] PCI/MSI: Don't touch MSI bits when the PCI device is
 disconnected

On 11/14/2018 12:00 AM, Bjorn Helgaas wrote:
> On Tue, Nov 13, 2018 at 10:39:15PM +0000, Alex_Gagniuc@...lteam.com wrote:
>> On 11/12/2018 11:02 PM, Bjorn Helgaas wrote:
>>>
>>> [EXTERNAL EMAIL]
>>> Please report any suspicious attachments, links, or requests for sensitive information.
> 
> It looks like Dell's email system adds the above in such a way that the
> email quoting convention suggests that *I* wrote it, when I did not.

I was wondering why you thought I was suspicious. It's a recent 
(server-side) change. You used to be able to disable these sort of 
notices. I'm told back in the day people were asked to delete emails 
before reading them.

>> ...
>>> Do you think Linux observes the rule about not touching AER bits on
>>> FFS?  I'm not sure it does.  I'm not even sure what section of the
>>> spec is relevant.
>>
>> I haven't found any place where linux breaks this rule. I'm very
>> confident that, unless otherwise instructed, we follow this rule.
> 
> Just to make sure we're on the same page, can you point me to this
> rule?  I do see that OSPM must request control of AER using _OSC
> before it touches the AER registers.  What I don't see is the
> connection between firmware-first and the AER registers.

ACPI 6.2 - 6.2.11.3, Table 6-197:

PCI Express Advanced Error Reporting control:
  * The firmware sets this bit to 1 to grant control over PCI Express 
Advanced Error Reporting. If firmware allows the OS control of this 
feature, then in the context of the _OSC method it must ensure that 
error messages are routed to device interrupts as described in the PCI 
Express Base Specification[...]

Now I'm confused too:
  * HEST -> __aer_firmware_first
	This is used for touching/not touching AER bits
  * _OSC -> bridge->native_aer
	Used to enable/not enable AER portdrv service
Maybe Keith knows better why we're doing it this way. From ACPI text, it 
doesn't seem that control of AER would be tied to HEST entries, although 
in practice, it is.

> The closest I can find is the "Enabled" field in the HEST PCIe
> AER structures (ACPI v6.2, sec 18.3.2.4, .5, .6), where it says:
> 
>    If the field value is 1, indicates this error source is
>    to be enabled.
> 
>    If the field value is 0, indicates that the error source
>    is not to be enabled.
> 
>    If FIRMWARE_FIRST is set in the flags field, the Enabled
>    field is ignored by the OSPM.
> 
> AFAICT, Linux completely ignores the Enabled field in these
> structures.

I don't think ignoring the field is a problem:
  * With FFS, OS should ignore it.
  * Without FFS, we have control, and we get to make the decisions anyway.
In the latter case we decide whether to use AER, independent of the crap 
in ACPI. I'm not even sure why "Enabled" matters in native AER handling. 
Probably one of the check-boxes in "Binary table designer's handbook"?

> These structures also contain values the OS is apparently supposed to
> write to Device Control and several AER registers (in struct
> acpi_hest_aer_common).  Linux ignores these as well.
> 
> These seem like fairly serious omissions in Linux.

I think HPX carries the same sort of information (except for Root 
Command reg). FW is supposed to program those registers anyway, so even 
if OS doesn't touch them, I'd expect things to just work.

>>> The whole issue of firmware-first, the mechanism by which firmware
>>> gets control, the System Error enables in Root Port Root Control
>>> registers, etc., is very murky to me.  Jon has a sort of similar issue
>>> with VMD where he needs to leave System Errors enabled instead of
>>> disabling them as we currently do.
>>
>> Well, OS gets control via _OSC method, and based on that it should
>> touch/not touch the AER bits.
> 
> I agree so far.
> 
>> The bits that get set/cleared come from _HPX method,
> 
> _HPX tells us about some AER registers, Device Control, Link Control,
> and some bridge registers.  It doesn't say anything about the Root
> Control register that Jon is concerned with.

_HPX type 3 (yay!!!) got approved recently, and that will have more 
fine-grained control. It will be able to handle root control reg.

> For firmware-first to work, firmware has to get control.  How does it
> get control?  How does OSPM know to either set up that mechanism or
> keep its mitts off something firmware set up before handoff?

My understanding is that, if FW keeps control of AER in _OSC, then it 
will have set things up to get notified instead of the OS. OSPM not 
touching AER bits is to make sure it doesn't mess up FW's setup. I think 
there are some proprietary bits in the root port to route interrupts to 
SMIs instead of the AER vectors.

> In Jon's
> VMD case, I think firmware-first relies on the System Error controlled
> by the Root Control register.  Linux thinks it owns that, and I don't
> know how to learn otherwise.

Didn't Keith say the root port is not visible to the OS?

Alex

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ