lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <06cd0121-819d-652d-afa7-eece15bf82a2@linux.intel.com>
Date: Tue, 21 Oct 2025 14:27:09 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Brian Norris <briannorris@...omium.org>
cc: Lukas Wunner <lukas@...ner.de>, Bjorn Helgaas <bhelgaas@...gle.com>, 
    LKML <linux-kernel@...r.kernel.org>, linux-pm@...r.kernel.org, 
    "Rafael J . Wysocki" <rafael@...nel.org>, linux-pci@...r.kernel.org
Subject: Re: [PATCH] PCI/PM: Prevent runtime suspend before devices are fully
 initialized

On Mon, 20 Oct 2025, Brian Norris wrote:

> Hi Ilpo,
> 
> On Mon, Oct 20, 2025 at 06:56:41PM +0300, Ilpo Järvinen wrote:
> > On Fri, 17 Oct 2025, Brian Norris wrote:
> > 
> > > On Fri, Oct 17, 2025 at 02:49:35PM +0300, Ilpo Järvinen wrote:
> > > > On Fri, 17 Oct 2025, Lukas Wunner wrote:
> > > > 
> > > > > [cc += Ilpo]
> > > > > 
> > > > > On Thu, Oct 16, 2025 at 03:53:35PM -0700, Brian Norris wrote:
> > > > > > PCI devices are created via pci_scan_slot() and similar, and are
> > > > > > promptly configured for runtime PM (pci_pm_init()). They are initially
> > > > > > prevented from suspending by way of pm_runtime_forbid(); however, it's
> > > > > > expected that user space may override this via sysfs [1].
> > > > 
> > > > Is this true as pm_runtime_forbid() also increases PM usage count?
> > > 
> > > Yes it's true. See below.
> > > 
> > > > "void pm_runtime_forbid(struct device *dev);
> > > > 
> > > > unset the power.runtime_auto flag for the device and increase its 
> > > > usage counter (used by the /sys/devices/.../power/control interface to 
> > > > effectively prevent the device from being power managed at run time)"
> 
> I see this doc line confused you, and I can sympathize.
> 
> IIUC, the parenthetical means that sysfs *uses* pm_runtime_forbid() to
> "effectively prevent runtime power management"; pm_runtime_forbid() does
> not block user space from doing anything.
>
> > > Right, but sysfs `echo auto > .../power/control` performs the inverse --
> > > pm_runtime_allow() -- which decrements that count.
> > 
> > Fair enough, I didn't check what it does.
> > 
> > IMO, the details about how the usage count behaves should be part of the 
> > changelog as that documentation I quoted sounded like user control is 
> > prevented when forbidden.
> 
> I tried to elaborate on the API doc confusion above. But frankly, I'm
> not sure how best to explain runtime PM.
> 
> > I see you've put this part of the explanation 
> > into the v2 as well so I suggest you explain the usage count in the change 
> > so it is recorded in the commit if somebody has to look at this commit 
> > years from now.
> 
> Both v1 and v2 mention that the sysfs 'power/control' file can override
> the kernel calling pm_runtime_forbid(). They don't mention the usage
> count, since that's an implementation detail IMO. (To me, the mental
> model works best if "usage count" (usually get()/put()) is considered
> mostly orthogonal to forbid()/allow()/sysfs, because "forbid()" can be
> overridden at any time.)
> 
> This is also covered here:
> 
> https://docs.kernel.org/power/runtime_pm.html#runtime-pm-initialization-device-probing-and-removal
> 
> "In principle, this mechanism may also be used by the driver to
> effectively turn off the runtime power management of the device until
> the user space turns it on."

The problem is already rooted into the function name, when a function is 
called "forbid", anyone unfamiliar will think it really forbids 
something. The docs just further reinforced the idea and the fact that it 
also increments usage count.

It is quite unexpected and feels quite illogical (for non-PM person like 
me) that user interface then goes to reverse that usage count increase, 
what would be the logical reason why there now are less users for it when 
user wants to turn on PM? (I understand things are done that way, no need 
to explain that further, but there are quite a few misleading things in 
this entire scenario, not just that parenthesis part of the docs.)

> But admittedly, I find the runtime PM API surface to be enormous, and
> its documentation ... not very helpful to outsiders. It's pretty much
> written by and for PM experts. Case in point: you read and quoted the
> appropriate docs, but it still misled you quite a bit :(
> 
> I'm more tempted to try to improve the runtime PM docs than to try to
> make up for them in the commit message, but maybe I can be convinced
> otherwise.

My personal approach is that if something comes up during review, it 
likely is something also the next person to look at this change (from 
history, maybe years from now) could similarly stumbles on when trying to 
understand the change. Thus, it often is worth to mention such things in 
the changelog too.

While I'm definitely not against improvements to docs too, the changelog 
for any patch should help to understand why the change was made. And 
IMO, this unexpected "internal detail" related to usage count which is 
quite significant here, if user interface wouldn't lower it, runtime PM 
would remain forbidden as forbid() promised.

> > > > > Patch LGTM in principle, but adding Ilpo to cc who is refactoring PCI
> > > > > resource allocation and may judge whether this can actually happen.
> > > > 
> > > > I can see there could be other failure modes besides just saving wrong 
> > > > config if devices get suspended underneath the resource assignment 
> > > > algorithm.
> > > > 
> > > > Besides hotplug, also BAR resize does changes the resources and BARs.
> > > > This case is not helped by this patch.
> > > 
> > > Is that the 'resource_N_resize' sysfs attributes? Becuase that holds PM
> > > references (pci_config_pm_runtime_{get,put}()) and therefore should not
> > > generally have the same problem.
> > 
> > Okay, seem fine for the PCI core part.
> > 
> > Driver's can also trigger BAR resize by calling pci_resize_resource() 
> > directly but I've no idea how the usage counts behave (TBH, PM isn't my 
> > strongest forte even if Lukas pulled me in to comment).
> 
> There are only 3 drivers that call pci_resize_resource(). I looked into
> them, and it looks like they all hold pm_runtime_* references while
> doing this, or are calling it during pci_driver probe(), which docs say
> will hold a reference.
> 
> https://docs.kernel.org/power/pci.html#device-runtime-power-management
> 
> So they should all be OK too.

Thanks for checking.

-- 
 i.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ