lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Mon, 15 Aug 2022 17:02:13 +0800
From:   Manyi Li <limanyi@...ontech.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Bjorn Helgaas <helgaas@...nel.org>
Cc:     Kai-Heng Feng <kai.heng.feng@...onical.com>,
        "Saheed O. Bolarinwa" <refactormyself@...il.com>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Rajat Jain <rajatja@...gle.com>,
        Linux PCI <linux-pci@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Vidya Sagar <vidyas@...dia.com>,
        Matthew Garrett <mjg59@...f.ucam.org>,
        ACPI Devel Maling List <linux-acpi@...r.kernel.org>
Subject: Re: [PATCH] PCI/ASPM: Should not report ASPM support to BIOS if FADT
 indicates ASPM is unsupported



在 2022/7/15 22:07, Rafael J. Wysocki 写道:
> On Fri, Jul 15, 2022 at 2:24 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>>
>> On Fri, Jul 15, 2022 at 9:40 AM Manyi Li <limanyi@...ontech.com> wrote:
>>>
>>>
>>>
>>> On 2022/7/14 11:20, Kai-Heng Feng wrote:
>>>> [+Cc Matthew]
>>>>
>>>> On Thu, Jul 14, 2022 at 2:28 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
>>>>>
>>>>> [+cc Kai-Heng, Vidya, who also have ASPM patches in flight]
>>>>>
>>>>> On Wed, Jul 13, 2022 at 07:26:12PM +0800, Manyi Li wrote:
>>>>>> Startup log of ASUSTeK X456UJ Notebook show:
>>>>>> [    0.130563] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
>>>>>> [   48.092472] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
>>>>>> [   48.092479] pcieport 0000:00:1c.5:   device [8086:9d15] error status/mask=00000001/00002000
>>>>>> [   48.092481] pcieport 0000:00:1c.5:    [ 0] RxErr
>>>>>> [   48.092490] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
>>>>>> [   48.092504] pcieport 0000:00:1c.5: AER: can't find device of ID00e5
>>>>>> [   48.092506] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
>>>>>
>>>>> Can you elaborate on the connection between the FADT ASPM bit and the
>>>>> AER logs above?
>>>
>>> Sorry,I don't know about that.
>>>
>>>>>
>>>>> What problem are we solving here?  A single corrected error being
>>>>> logged?  An infinite stream of errors?  A device that doesn't work at
>>>>> all?
>>>>
>>>> Agree, what's the real symptom of the issue?
>>>
>>> Please see the details of this issus:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=216245
>>>
>>>>
>>>>>
>>>>> We don't need the dmesg timestamps unless they contribute to
>>>>> understanding the problem.  I don't think they do in this case.
>>>>
>>>> According to commit 387d37577fdd ("PCI: Don't clear ASPM bits when the
>>>> FADT declares it's unsupported"), the bit means "just use the ASPM
>>>> bits handed over by BIOS".
>>>>
>>>> However, I do wonder why both drivers/pci/pci-acpi.c and
>>>> drivers/acpi/pci_root.c are doing the ACPI_FADT_NO_ASPM check,
>>
>> Because pci_root.c doesn't read aspm_disabled.
> 
> I've recalled a bit in the meantime.
> 
> First off, ACPI_FADT_NO_ASPM forbids the OS from enabling ASPM control
> (quite literally).  It doesn't mean that the OS should not enumerate
> ASPM and it doesn't mean that it should not report ASPM support to the
> firmware via _OSC.  Moreover, there are (or at least there were)
> systems where the firmware expected ASPM support to be reported via
> _OSC anyway (see commit 8b8bae901ce2 PCI/ACPI: Report ASPM support to
> BIOS if not disabled from command line).
> 
> Thus, if ASPM is not disabled from command line, it would be
> consistent to carry out the _OSC negotiation as usual regardless of
> ACPI_FADT_NO_ASPM and then handle the case in which it is set in the
> same way as the case in which the firmware doesn't grant the kernel
> control of some PCIe features.  Does this sound reasonable

This sound reasonable.

> 
> If it does, I think that ASPM should be enumerated regardless of
> ACPI_FADT_NO_ASPM, but we need to ensure that its configuration is not
> changed in any way if ACPI_FADT_NO_ASPM is set and I'm not sure if
> that is the case now.
> 
> Of course, the same needs to happen when the kernel doesn't get full
> control over PCIe features via _OSC, but AFAICS that case is handled
> in the same way as the above already.
> 
>>>> maybe one of them should be removed?
>>
>> Arguably, pci_root.c could look at aspm_disabled instead of looking at
>> the FADT flag directly.
> 
> Second, if the former does sound reasonable, I'd rather avoid setting
> aspm_disabled from drivers/pci/pci-acpi.c upfront when
> ACPI_FADT_NO_ASPM is set, because doing that is not consistent with
> the above.
> 
> Now, there may be BIOSes that don't expect to be informed of the OS
> support for ASPM via _OSC if ACPI_FADT_NO_ASPM is set, and the
> question is what to do with them.  They clearly are at odds with the
> BIOSes that do expect that to happen (mentioned above), so honestly
> I'm not sure.

I'm not sure my issues is caused by report ASPM support to the firmware 
via _OSC.My issues is the same as this link:
https://groups.google.com/g/fa.linux.kernel/c/0uz8Nr_NVOI

Links to other discussions on this issue:
https://lore.kernel.org/all/20151229155822.GA17321@localhost/T/#u
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1521173

> 
>>> I think duplicate work has been done, but comment
>>> in drivers/acpi/pci_root.c is
>>> * We want to disable ASPM here, but aspm_disabled
>>> * needs to remain in its state from boot so that we
>>> * properly handle PCIe 1.1 devices.  So we set this
>>> * flag here, to defer the action until after the ACPI
>>> * root scan.
>>>
>>> I don't understand this logic.
>>
>> This is about the case after failing acpi_pci_osc_control_set() and
>> generally we need to defer setting aspm_disabled because of
>> pcie_aspm_sanity_check().
>>
>>>>
>>>>>
>>>>>> Signed-off-by: Manyi Li <limanyi@...ontech.com>
>>>>>> ---
>>>>>>    drivers/pci/pcie/aspm.c | 1 +
>>>>>>    1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>>>>> index a96b7424c9bc..b173d3c75ae7 100644
>>>>>> --- a/drivers/pci/pcie/aspm.c
>>>>>> +++ b/drivers/pci/pcie/aspm.c
>>>>>> @@ -1359,6 +1359,7 @@ void pcie_no_aspm(void)
>>>>>>         if (!aspm_force) {
>>>>>>                 aspm_policy = POLICY_DEFAULT;
>>>>>>                 aspm_disabled = 1;
>>>>>> +             aspm_support_enabled = false;
>>>>>
>>>>> This makes pcie_no_aspm() work the same as booting with
>>>>> "pcie_aspm=off".  That might be reasonable.
>>>>>
>>>>> I do wonder why we need both "aspm_disabled" and
>>>>> "aspm_support_enabled".  And I wonder why we need to set "aspm_policy"
>>>>> when we're disabling ASPM.  But those aren't really connected to your
>>>>> change here.
>>>>
>>>>   From what I can understand "aspm_disabled" means "don't touch ASPM
>>>> left by BIOS", and "aspm_support_enabled" means "whether ASPM is
>>>> disabled via command line".
>>>> There seems to be some overlaps though.
>>>
>>> According to commit 8b8bae901ce23 ("PCI/ACPI: Report ASPM support to
>>> BIOS if not disabled from command line"), "aspm_support_enabled" means
>>> whether or not report ASPM support to the BIOS through _OSC.
>>
>> Right.
> 

-- 
Manyi Li

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ