Message-ID: <aPH_B7SiJ8KnIAwJ@wunner.de>
Date: Fri, 17 Oct 2025 10:32:07 +0200
From: Lukas Wunner <lukas@...ner.de>
To: Brian Norris <briannorris@...omium.org>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, linux-kernel@...r.kernel.org,
linux-pm@...r.kernel.org, "Rafael J . Wysocki" <rafael@...nel.org>,
linux-pci@...r.kernel.org,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Subject: Re: [PATCH] PCI/PM: Prevent runtime suspend before devices are fully
initialized

[cc += Ilpo]

On Thu, Oct 16, 2025 at 03:53:35PM -0700, Brian Norris wrote:
> PCI devices are created via pci_scan_slot() and similar, and are
> promptly configured for runtime PM (pci_pm_init()). They are initially
> prevented from suspending by way of pm_runtime_forbid(); however, it's
> expected that user space may override this via sysfs [1].
>
> Now, sometime after initial scan, a PCI device receives its BAR
> configuration (pci_assign_unassigned_bus_resources(), etc.).
>
> If a PCI device is allowed to suspend between pci_scan_slot() and
> pci_assign_unassigned_bus_resources(), then pci-driver.c will
> save/restore incorrect BAR configuration for the device, and the device
> may cease to function.
>
> This behavior races with user space, since user space may enable runtime
> PM [1] as soon as it sees the device, which may be before BAR
> configuration.
>
> Prevent suspending in this intermediate state by holding a runtime PM
> reference until the device is fully initialized and ready for probe().

I'm not sure that is comprehensible to everybody. The point is that
unbound devices are left in D0 but are nevertheless allowed to
(logically) runtime suspend. So pci_pm_runtime_suspend() may call
pci_save_state() while config space isn't fully initialized yet,
and pci_pm_runtime_resume() may then call pci_restore_state() (via
pci_pm_default_resume_early()) and overwrite initialized config space
with uninitialized data.
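
The ordering problem can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not kernel code: struct
toy_dev and all toy_*() functions are hypothetical stand-ins, a single
bar0 field stands in for config space, and the usage_count check is a
simplified picture of a runtime PM reference blocking suspend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a PCI device; NOT the kernel's struct pci_dev. */
struct toy_dev {
	uint32_t bar0;        /* live "config space" BAR value */
	uint32_t saved_bar0;  /* copy taken by the save step */
	bool state_saved;     /* true while saved_bar0 is valid */
	bool suspended;       /* logical runtime PM state; device stays in D0 */
	int usage_count;      /* simplified runtime PM reference count */
};

/* Models scan plus PM init; with hold_ref a reference is taken here. */
static void toy_scan(struct toy_dev *d, bool hold_ref)
{
	d->bar0 = 0;          /* BAR not assigned yet */
	d->saved_bar0 = 0;
	d->state_saved = false;
	d->suspended = false;
	d->usage_count = hold_ref ? 1 : 0;
}

/* Models runtime suspend of an unbound device: save state, stay in D0. */
static bool toy_runtime_suspend(struct toy_dev *d)
{
	if (d->usage_count > 0 || d->suspended)
		return false;     /* a held reference blocks suspend */
	d->saved_bar0 = d->bar0;
	d->state_saved = true;
	d->suspended = true;
	return true;
}

/* Models runtime resume: restore whatever was saved earlier. */
static void toy_runtime_resume(struct toy_dev *d)
{
	if (!d->suspended)
		return;
	if (d->state_saved) {
		d->bar0 = d->saved_bar0;  /* may clobber a newer BAR value */
		d->state_saved = false;
	}
	d->suspended = false;
}

/* Models resource assignment happening some time after the scan. */
static void toy_assign_bar(struct toy_dev *d, uint32_t addr)
{
	assert(addr != 0);    /* 0 means "unassigned" in this model */
	d->bar0 = addr;
}

/* Models the device becoming ready for probe: drop the held reference. */
static void toy_add_device(struct toy_dev *d)
{
	if (d->usage_count > 0)
		d->usage_count--;
}

/* Replays the racy ordering and returns the final BAR value. */
static uint32_t toy_race(bool hold_ref)
{
	struct toy_dev d;

	toy_scan(&d, hold_ref);
	toy_runtime_suspend(&d);          /* user space allowed runtime PM early */
	toy_assign_bar(&d, 0xfe000000u);  /* BAR assigned after the suspend */
	toy_add_device(&d);
	toy_runtime_resume(&d);
	return d.bar0;
}
```

In this model, toy_race(false) ends with the BAR back at its
unassigned value, because the resume step restores the copy saved
before assignment, while toy_race(true) keeps the assigned address,
because the held reference makes the early suspend a no-op.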

Have you actually seen this happen in practice? Normally enumeration
happens during subsys_initcall time, when user space isn't running yet.
Hotplug may be an exception though.

Patch LGTM in principle, but I'm adding Ilpo to cc, who is refactoring
PCI resource allocation and may judge whether this can actually happen.

I think the code comments you're adding are a little verbose, and a simple
/* acquired in pci_pm_init() */ in pci_bus_add_device() may be sufficient.

Also, I think it is neither necessary nor useful to actually cc the e-mail
to stable@...r.kernel.org if you include a stable designation in the
patch. I believe stable maintainers only pick up backports from that list,
not patches intended for upstream.

Thanks,

Lukas