Date:   Tue, 18 Apr 2023 15:58:15 +0800
From:   LeoLiuoc <LeoLiu-oc@...oxin.com>
To:     Bjorn Helgaas <helgaas@...nel.org>
CC:     <rafael@...nel.org>, <lenb@...nel.org>, <james.morse@....com>,
        <tony.luck@...el.com>, <bp@...en8.de>, <robert.moore@...el.com>,
        <ying.huang@...el.com>, <rdunlap@...radead.org>,
        <bhelgaas@...gle.com>, <linux-acpi@...r.kernel.org>,
        <linux-pci@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <devel@...ica.org>, <CobeChen@...oxin.com>,
        <TonyWWang@...oxin.com>, <ErosZhang@...oxin.com>,
        Sathyanarayanan Kuppuswamy 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        "Li, Ming" <ming4.li@...el.com>, <leoliu@...oxin.com>
Subject: Re: [PATCH v2 0/5] Parse the PCIe AER and set to relevant registers



On 2023/4/13 0:32, Bjorn Helgaas wrote:
> On Wed, Apr 12, 2023 at 05:11:28PM +0800, LeoLiuoc wrote:
>> On 2023/4/8 7:18, Bjorn Helgaas wrote:
>>> On Tue, Nov 15, 2022 at 11:11:15AM +0800, LeoLiu-oc wrote:
>>>> From: leoliu-oc <leoliu-oc@...oxin.com>
>>>>
>>>> According to the sec 18.3.2.4, 18.3.2.5 and 18.3.2.6 in ACPI r6.5, the
>>>> register values from HEST PCI Express AER Structure should be written to
>>>> relevant PCIe Device's AER Capabilities. So the purpose of the patch set
>>>> is to extract register values from HEST PCI Express AER structures and
>>>> program them into AER Capabilities. Refer to the ACPI Spec r6.5 for a more
>>>> detailed description.
>>>
>>> I wasn't involved in this part of the ACPI spec, and I don't
>>> understand how this is intended to work.
>>>
>>> I see that this series extracts AER mask, severity, and control
>>> information from the ACPI HEST table and uses it to configure PCIe
>>> devices as they are enumerated.
>>>
>>> What I don't understand is how this relates to ownership of the AER
>>> capability as negotiated by the _OSC method.  Firmware can configure
>>> the AER capability itself, and if it retains control of the AER
>>> capability, the OS can't write to it (with the exception of clearing
>>> EDR error status), so this wouldn't be necessary.
>>
>> There is no relationship between the ownership of the AER related
>> register and the ownership of the AER capability in the OS or
>> Firmware.
> 
> I don't understand this; can you say it another way?  "Ownership of
> the AER related register" and "ownership of the AER capability" sound
> exactly the same to me.
> 

What I mean is that writing the AER capability registers of a PCIe
device from the HEST PCI Express AER structure is independent of who
owns AER.

I could not find a direct statement in ACPI Spec r6.5 that allows the OS
to write the HEST AER register values to the AER registers of the
corresponding device without AER control. However, the ACPI Spec does
describe the relationship between writing the AER registers through the
_HPP/_HPX methods and whether the OS requires AER control. The relevant
passages are as follows:
1. OSPM uses the information returned by _HPX to determine how (a) to
configure PCI Functions that are hot-plugged into the system, (b) to
configure Functions not configured by the platform firmware during
initial system boot, (c) and to configure Functions any time they lose
configuration space settings (e.g. OSPM issues a Secondary Bus
Reset/Function Level Reset or Downstream Port Containment is triggered).

2. _HPX may return multiple types or Record Settings (each setting in a 
single sub-package.) OSPM is responsible for detecting the type of 
Function and for applying the appropriate settings. OSPM is also 
responsible for detecting the device / port type of the PCI Express 
Function and applying the appropriate settings provided. For example, 
the Secondary Uncorrectable Error Severity and Secondary Uncorrectable 
Error Mask settings of Type 2 record are only applicable to PCI Express 
to PCI-X/PCI Bridge whose device / port type is 1000b. Similarly, AER 
settings are only applicable to hot plug PCI Express devices that 
support the optional AER capability.

3. Note: OSPM may override the settings provided by the _HPX object’s 
Type2 record (PCI Express Settings) or Type3 record (PCI Express 
Descriptor Settings) when OSPM has assumed native control of the 
corresponding feature. For example, if OSPM has assumed ownership of AER 
(via _OSC), OSPM may override AER related settings returned by _HPX.
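
To make passages 2 and 3 concrete, here is a minimal sketch in plain C.
It is not kernel code. The struct and function names are hypothetical,
and it assumes each register setting is an AND mask plus an OR mask, as
in the _HPX Type 2 record as I understand it:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one register setting from firmware. */
struct hpx_reg_setting {
    uint32_t and_mask;  /* bits of the current value to keep */
    uint32_t or_mask;   /* bits to force to 1 */
};

/* Hypothetical model of a Function; only one AER register is shown. */
struct aer_function {
    bool     has_aer_cap;  /* optional AER capability present? */
    uint32_t uncor_mask;   /* Uncorrectable Error Mask register */
};

/* Combine the current register value with the firmware setting. */
static uint32_t hpx_apply(uint32_t reg, const struct hpx_reg_setting *s)
{
    return (reg & s->and_mask) | s->or_mask;
}

/*
 * Returns true if the register was written.
 * os_value may be NULL when the OS has no value of its own to apply.
 */
static bool program_uncor_mask(struct aer_function *fn,
                               const struct hpx_reg_setting *fw,
                               bool os_owns_aer,
                               const uint32_t *os_value)
{
    if (!fn->has_aer_cap)
        return false;               /* passage 2: settings do not apply */

    if (os_owns_aer && os_value) {
        fn->uncor_mask = *os_value; /* passage 3: OSPM may override */
        return true;
    }

    /* Otherwise write the firmware-provided setting. */
    fn->uncor_mask = hpx_apply(fn->uncor_mask, fw);
    return true;
}
```

Note that the sketch contains no check that blocks the write when the OS
does not own AER. Ownership only selects whose value is used.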

This means that writing the AER register values through _HPX does not
require the OS to have control of AER. From the usage description of
_HPX, I also think that ownership of AER determines who decides the
configuration values of the AER registers, not who is allowed to write
them. Even when the OS does not have control or ownership of AER, it
should still write the configuration values chosen by the firmware to
the AER registers at the firmware's request.

The HEST AER patch is an effective supplement to _HPP/_HPX for firmware
that does not support those methods. So the question of whether the OS
needs AER control in order to write the HEST AER structure values to a
device's AER registers is the same question that applies to writing AER
values through _HPP/_HPX. That is why this patch does not consider the
ownership of AER.
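
To spell out the difference between the reading I describe above and the
stricter one you raised (the OS cannot write the AER capability unless
_OSC granted it control), here is a small decision sketch in plain C.
All names are hypothetical. It only illustrates the two interpretations
and is not taken from the patch:

```c
#include <stdbool.h>

/* The two interpretations discussed in this thread (labels are mine). */
enum aer_write_policy {
    POLICY_FW_VALUES_ALWAYS,   /* ownership decides the values only */
    POLICY_REQUIRE_OSC_GRANT,  /* no AER writes without _OSC control */
};

/* Where the values written to the AER registers come from. */
enum aer_source {
    SRC_NONE,            /* the OS writes nothing */
    SRC_FIRMWARE_TABLE,  /* values from HEST or _HPP/_HPX */
    SRC_OS_CHOICE,       /* the OS picks, and may still use firmware hints */
};

static enum aer_source pick_aer_source(enum aer_write_policy policy,
                                       bool os_owns_aer,
                                       bool fw_table_present)
{
    if (os_owns_aer)
        return SRC_OS_CHOICE;      /* same under both readings */

    if (policy == POLICY_REQUIRE_OSC_GRANT)
        return SRC_NONE;           /* strict reading: hands off */

    /* Firmware owns AER; the OS writes on its behalf if values exist. */
    return fw_table_present ? SRC_FIRMWARE_TABLE : SRC_NONE;
}
```

The two readings differ only when firmware keeps AER ownership and also
supplies values. That is exactly the case this patch set targets.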

>> The processing here is to initialize the AER related register, not
>> the AER event. If Firmware is configured with AER register, it will
>> not be able to handle the runtime hot reset and link retrain cases
>> in addition to the hotplug case you mentioned below.
>>
>>> If the OS owns the AER capability, I assume it gets to decide for
>>> itself how to configure AER, no matter what the ACPI HEST says.
>>
>> What information does the OS use to decide how to configure AER? The
>> ACPI Spec has the following description: PCI Express (PCIe) root
>> ports may implement PCIe Advanced Error Reporting (AER) support.
>> This table(HEST) contains  information platform firmware supplies to
>> OSPM for configuring AER support on a given root port. We understand
>> that HEST stands for user to express expectations.
>>
>> In the current implementation, the OS already configures a PCIE
>> device based on _HPP/_HPX method when configuring a PCI device
>> inserted into a hot-plug slot or initial configuration of a PCI
>> device at system boot. HEST is just another way to express the
>> desired configuration of the user.
> 
> Why was the HEST mechanism added if the functionality is equivalent
> to the existing _HPP/_HPX?  There must be something that HEST supplies
> that _HPP/_HPX did not.
> 
> I think we need some things in the commit log (and short comments in
> the code) to help maintain this in the future:
> 
>    - What problem does this solve, e.g., is there some bug that happens
>      because we lack this functionality?
> 
>    - How is this HEST mechanism related to _HPP/_HPX?  What are the
>      differences?
> 
>    - How is this related to _OSC AER ownership?
> 

Yes, I'll add explanations of these points to the commit log in the next
version of the series.

> I think we ignore _OSC ownership in the existing _HPP/_HPX code, but
> that seems like a potential problem.  The PCI Firmware spec (r3.3, sec
> 4.5.1) is pretty clear:
> 
>    If control of this feature was requested and denied or was not
>    requested, firmware returns this bit set to 0, and the operating
>    system must not modify the Advanced Error Reporting Capability or
>    the other error enable/status bits listed above.
> 

In my reading, the PCI Firmware Spec is not very clear about the
relationship between configuring the AER registers and the ownership of
AER. ACPI Spec r6.5 does describe the use of _HPP and _HPX: writing to
the AER registers through the _HPP/_HPX methods does not require the OS
to acquire control of AER.

Yours sincerely,
LeoLiu-oc

>>> Maybe this is intended for the case where firmware retains AER
>>> ownership but the OS uses native hotplug (pciehp), and this is a way
>>> for the OS to configure new devices as the firmware expects?  But in
>>> that case, we still have the problem that the OS can't write to the
>>> AER capability to do this configuration.
>>>
>>> Bjorn
