Message-ID: <CAJZ5v0hJHxVvyNdvDSZg=Pn9=xEqO79T4Ou9toc0Qofi777NcA@mail.gmail.com>
Date: Thu, 11 Sep 2025 15:56:33 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Mario Limonciello <superm1@...nel.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Lukas Wunner <lukas@...ner.de>,
Bjorn Helgaas <helgaas@...nel.org>, Alan Stern <stern@...land.harvard.edu>,
linux-pci@...r.kernel.org, linux-pm@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
Oleksij Rempel <o.rempel@...gutronix.de>, Timo Jyrinki <timo.jyrinki@....fi>,
Ernst Persson <ernstp@...il.com>, Steven Harms <sjharms@...il.com>, James Ettle <james@...le.org.uk>,
Nick Coghlan <ncoghlan@...il.com>, Weng Xuetian <wengxt@...il.com>,
Andrey Rahmatullin <wrar@...r.name>, Boris Barbour <boris.barbour@....fr>,
Vlastimil Zima <vlastimil.zima@...il.com>, David Banks <amoebae@...il.com>,
Michal Jaegermann <michal@...ddata.com>, Chris Moeller <kode54@...il.com>, Daniel Fraga <fragabr@...il.com>,
Javier Marcet <jmarcet@...il.com>, Pavel Pisa <pisa@....felk.cvut.cz>
Subject: Re: [PATCH] PCI/PM: Move ASUS EHCI workaround out of generic code
On Thu, Sep 11, 2025 at 3:46 PM Mario Limonciello <superm1@...nel.org> wrote:
>
> On 9/11/25 8:43 AM, Rafael J. Wysocki wrote:
> > On Thu, Sep 11, 2025 at 3:34 PM Mario Limonciello <superm1@...nel.org> wrote:
> >>
> >> On 9/11/25 8:11 AM, Lukas Wunner wrote:
> >>> In 2012, commit dbf0e4c7257f ("PCI: EHCI: fix crash during suspend on ASUS
> >>> computers") amended pci_pm_suspend_noirq() to work around a BIOS issue by
> >>> clearing the Command register if the suspended device is a USB EHCI host
> >>> controller.
> >>>
> >>> Commit 0b68c8e2c3af ("PCI: EHCI: Fix crash during hibernation on ASUS
> >>> computers") subsequently amended pci_pm_poweroff_noirq() to do the same.
> >>>
> >>> Two years later, commit 7d2a01b87f16 ("PCI: Add pci_fixup_suspend_late
> >>> quirk pass") introduced the ability to execute arbitrary quirks
> >>> specifically in pci_pm_suspend_noirq() and pci_pm_poweroff_noirq().
> >>>
> >>> This allows moving the ASUS workaround out of generic code and into a
> >>> proper quirk to improve maintainability and readability. Constrain to x86
> >>> since the ASUS BIOS doesn't seem to have been used on other arches.
> >>>
> >>> lspci output of affected EHCI host controllers reveals that the only bits
> >>> set in the Command register are Memory Space Enable and Bus Master Enable:
> >>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=658778
> >>>
> >>> The latter is cleared by:
> >>>   hcd_pci_suspend()
> >>>     suspend_common()
> >>>       pci_disable_device()
> >>>
> >>> pci_disable_device() does not clear I/O and Memory Space Enable, although
> >>> its name suggests otherwise.
> >>
> >> That was my gut reaction as well.
> >>
> >>> The kernel has never disabled these bits
> >>> once they're enabled. Doing so would avoid the need for the quirk, but it
> >>> is unclear what will break if this fundamental behavior is changed.
> >>>
> >>
> >> It's too late for this cycle to do so, but how would you feel about
> >> making this change at the start of the next cycle so it had a whole
> >> cycle to bake in linux-next and see if there is a problem in doing so?
> >
> > One cycle in linux-next may not be sufficient I'm afraid because
> > linux-next is not tested on the majority of systems running Linux.
> >
> > We'd probably learn about the breakage from distro vendors.
> >
> >> If there is it could certainly be moved back to a quirk.
> >
> > Most likely, it would work on the majority of systems, but there would
> > be a tail of systems where it would break. That tail would then need
> > to be quirked somehow and it may be worse than just one quirk we have
> > today.
>
> But is that a reason not to *try* and rid the tech debt?
>
> We could just all agree that *if* there is breakage we revert back to
> the quirk just for EHCI.

Well, it's not that simple because how much time do you want to wait?

The distro installed on the system I'm using right now ships with a
6.4-based kernel, so it potentially sees and may report breakage
introduced into the mainline 2 years ago.

Will you decide to go back to the EHCI quirk if breakage is reported 2
years after dropping it?

IMV, if a decision is made to change the pci_disable_device() behavior
in this respect, we'll need to stick to it unless the breakage is
common and overwhelming (which I don't really expect to be the case).
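
For reference, the Command register behavior described in the quoted patch
description can be modeled in user space roughly as below. This is only a
sketch under stated assumptions, not kernel code: the bit values are the
standard PCI Command register bits, while the CMD_* macros and both model_*
helpers are hypothetical names made up for illustration.

```c
#include <stdint.h>

/* Standard PCI Command register bits (the kernel calls these
 * PCI_COMMAND_IO, PCI_COMMAND_MEMORY and PCI_COMMAND_MASTER). */
#define CMD_IO     0x1u  /* I/O Space Enable */
#define CMD_MEMORY 0x2u  /* Memory Space Enable */
#define CMD_MASTER 0x4u  /* Bus Master Enable */

/* Hypothetical model of the effect of pci_disable_device() on the
 * Command register as described in the thread: only Bus Master Enable
 * is cleared, I/O and Memory Space Enable are left alone. */
static uint16_t model_disable_device(uint16_t cmd)
{
	return (uint16_t)(cmd & ~CMD_MASTER);
}

/* Hypothetical model of the ASUS EHCI workaround: the whole Command
 * register is cleared, whatever it contained before. */
static uint16_t model_ehci_workaround(uint16_t cmd)
{
	(void)cmd;
	return 0;
}
```

Starting from the state seen in the lspci output (CMD_MEMORY | CMD_MASTER,
i.e. 0x0006), model_disable_device() leaves 0x0002, so Memory Space Enable
is still set, and only model_ehci_workaround() brings the register to 0.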