lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0gt761WUPn-3HQ3sA+8N_s_yHrSkk6CH1gBW0gy1c_+KA@mail.gmail.com>
Date:   Fri, 15 Jul 2022 16:07:28 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     Manyi Li <limanyi@...ontech.com>,
        Kai-Heng Feng <kai.heng.feng@...onical.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "Saheed O. Bolarinwa" <refactormyself@...il.com>,
        Krzysztof WilczyƄski <kw@...ux.com>,
        Rajat Jain <rajatja@...gle.com>,
        Linux PCI <linux-pci@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Vidya Sagar <vidyas@...dia.com>,
        Matthew Garrett <mjg59@...f.ucam.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        ACPI Devel Maling List <linux-acpi@...r.kernel.org>
Subject: Re: [PATCH] PCI/ASPM: Should not report ASPM support to BIOS if FADT
 indicates ASPM is unsupported

On Fri, Jul 15, 2022 at 2:24 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Fri, Jul 15, 2022 at 9:40 AM Manyi Li <limanyi@...ontech.com> wrote:
> >
> >
> >
> > On 2022/7/14 11:20, Kai-Heng Feng wrote:
> > > [+Cc Matthew]
> > >
> > > On Thu, Jul 14, 2022 at 2:28 AM Bjorn Helgaas <helgaas@...nel.org> wrote:
> > >>
> > >> [+cc Kai-Heng, Vidya, who also have ASPM patches in flight]
> > >>
> > >> On Wed, Jul 13, 2022 at 07:26:12PM +0800, Manyi Li wrote:
> > >>> Startup log of ASUSTeK X456UJ Notebook show:
> > >>> [    0.130563] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
> > >>> [   48.092472] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
> > >>> [   48.092479] pcieport 0000:00:1c.5:   device [8086:9d15] error status/mask=00000001/00002000
> > >>> [   48.092481] pcieport 0000:00:1c.5:    [ 0] RxErr
> > >>> [   48.092490] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
> > >>> [   48.092504] pcieport 0000:00:1c.5: AER: can't find device of ID00e5
> > >>> [   48.092506] pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
> > >>
> > >> Can you elaborate on the connection between the FADT ASPM bit and the
> > >> AER logs above?
> >
> > Sorry,I don't know about that.
> >
> > >>
> > >> What problem are we solving here?  A single corrected error being
> > >> logged?  An infinite stream of errors?  A device that doesn't work at
> > >> all?
> > >
> > > Agree, what's the real symptom of the issue?
> >
> > Please see the details of this issus:
> > https://bugzilla.kernel.org/show_bug.cgi?id=216245
> >
> > >
> > >>
> > >> We don't need the dmesg timestamps unless they contribute to
> > >> understanding the problem.  I don't think they do in this case.
> > >
> > > According to commit 387d37577fdd ("PCI: Don't clear ASPM bits when the
> > > FADT declares it's unsupported"), the bit means "just use the ASPM
> > > bits handed over by BIOS".
> > >
> > > However, I do wonder why both drivers/pci/pci-acpi.c and
> > > drivers/acpi/pci_root.c are doing the ACPI_FADT_NO_ASPM check,
>
> Because pci_root.c doesn't read aspm_disabled.

I've recalled a bit in the meantime.

First off, ACPI_FADT_NO_ASPM forbids the OS from enabling ASPM control
(quite literally).  It doesn't mean that the OS should not enumerate
ASPM and it doesn't mean that it should not report ASPM support to the
firmware via _OSC.  Moreover, there are (or at least there were)
systems where the firmware expected ASPM support to be reported via
_OSC anyway (see commit 8b8bae901ce2 PCI/ACPI: Report ASPM support to
BIOS if not disabled from command line).

Thus, if ASPM is not disabled from command line, it would be
consistent to carry out the _OSC negotiation as usual regardless of
ACPI_FADT_NO_ASPM and then handle the case in which it is set in the
same way as the case in which the firmware doesn't grant the kernel
control of some PCIe features.  Does this sound reasonable?

If it does, I think that ASPM should be enumerated regardless of
ACPI_FADT_NO_ASPM, but we need to ensure that its configuration is not
changed in any way if ACPI_FADT_NO_ASPM is set and I'm not sure if
that is the case now.

Of course, the same needs to happen when the kernel doesn't get full
control over PCIe features via _OSC, but AFAICS that case is handled
in the same way as the above already.

> > > maybe one of them should be removed?
>
> Arguably, pci_root.c could look at aspm_disabled instead of looking at
> the FADT flag directly.

Second, if the former does sound reasonable, I'd rather avoid setting
aspm_disabled from drivers/pci/pci-acpi.c upfront when
ACPI_FADT_NO_ASPM is set, because doing that is not consistent with
the above.

Now, there may be BIOSes that don't expect to be informed of the OS
support for ASPM via _OSC if ACPI_FADT_NO_ASPM is set, and the
question is what to do with them.  They clearly are at odds with the
BIOSes that do expect that to happen (mentioned above), so honestly
I'm not sure.

> > I think duplicate work has been done, but comment
> > in drivers/acpi/pci_root.c is
> > * We want to disable ASPM here, but aspm_disabled
> > * needs to remain in its state from boot so that we
> > * properly handle PCIe 1.1 devices.  So we set this
> > * flag here, to defer the action until after the ACPI
> > * root scan.
> >
> > I don't understand this logic.
>
> This is about the case after failing acpi_pci_osc_control_set() and
> generally we need to defer setting aspm_disabled because of
> pcie_aspm_sanity_check().
>
> > >
> > >>
> > >>> Signed-off-by: Manyi Li <limanyi@...ontech.com>
> > >>> ---
> > >>>   drivers/pci/pcie/aspm.c | 1 +
> > >>>   1 file changed, 1 insertion(+)
> > >>>
> > >>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > >>> index a96b7424c9bc..b173d3c75ae7 100644
> > >>> --- a/drivers/pci/pcie/aspm.c
> > >>> +++ b/drivers/pci/pcie/aspm.c
> > >>> @@ -1359,6 +1359,7 @@ void pcie_no_aspm(void)
> > >>>        if (!aspm_force) {
> > >>>                aspm_policy = POLICY_DEFAULT;
> > >>>                aspm_disabled = 1;
> > >>> +             aspm_support_enabled = false;
> > >>
> > >> This makes pcie_no_aspm() work the same as booting with
> > >> "pcie_aspm=off".  That might be reasonable.
> > >>
> > >> I do wonder why we need both "aspm_disabled" and
> > >> "aspm_support_enabled".  And I wonder why we need to set "aspm_policy"
> > >> when we're disabling ASPM.  But those aren't really connected to your
> > >> change here.
> > >
> > >  From what I can understand "aspm_disabled" means "don't touch ASPM
> > > left by BIOS", and "aspm_support_enabled" means "whether ASPM is
> > > disabled via command line".
> > > There seems to be some overlaps though.
> >
> > According to commit 8b8bae901ce23 ("PCI/ACPI: Report ASPM support to
> > BIOS if not disabled from command line"), "aspm_support_enabled" means
> > whether or not report ASPM support to the BIOS through _OSC.
>
> Right.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ