Message-ID: <aO15eFW430nuXMa5@google.com>
Date: Mon, 13 Oct 2025 15:13:12 -0700
From: Brian Norris <briannorris@...omium.org>
To: Manivannan Sadhasivam <mani@...nel.org>
Cc: Bjorn Helgaas <helgaas@...nel.org>,
Mika Westerberg <mika.westerberg@...ux.intel.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH] PCI/PM: Avoid redundant delays on D3hot->D3cold
On Mon, Oct 06, 2025 at 04:13:26PM -0700, Manivannan Sadhasivam wrote:
> On Mon, Oct 06, 2025 at 02:33:33PM -0500, Bjorn Helgaas wrote:
> > On Mon, Oct 06, 2025 at 11:32:38AM -0700, Brian Norris wrote:
> > > Some PCI drivers call pci_set_power_state(..., PCI_D3hot) on their own
> > > when preparing for runtime or system suspend, so by the time they hit
> > > pci_finish_runtime_suspend(), they're in D3hot. Then, pci_target_state()
> > > may still pick a lower state (D3cold).
> >
> > We might need this change, but maybe this is also an opportunity to
> > remove some of those pci_set_power_state(..., PCI_D3hot) calls from
> > drivers.
> >
>
> Agree. The PCI client drivers should have no business in opting for D3Hot in the
> suspend path.

I dunno. There are various reasons a device might want to go to D3Hot
some time before fully suspending the system, and possibly even before
runtime suspend (or they may not support runtime PM at all). For
example, on the first step of my alphabetical trawl through

  git grep -l '\<pci_set_power_state\>' drivers/

I found a driver that supports some power-toggling via debugfs, in
drivers/accel/habanalabs/common/debugfs.c. It would take nontrivial
effort to evaluate every case like that for removal.

BTW, we even have documentation for this:

  https://docs.kernel.org/power/pci.html#suspend

"However, in some rare case it is convenient to carry out these operations in
a PCI driver. Then, pci_save_state(), pci_prepare_to_sleep(), and
pci_set_power_state() should be used to save the device's standard configuration
registers, to prepare it for system wakeup (if necessary), and to put it into a
low-power state, respectively."

So sure, it should be rare (like the docs say), and it's probably
redundant in many cases, but I'm not that interested in shaving various
drivers' yaks right now. I'm just fixing a (small) performance
regression in documented behavior.
> It should be the other way around, they should opt-out if they
> want by calling pci_save_state(), but that is also subject to discussion.

FWIW, that's also documented in the above link.

Brian