lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <41d5c358-e469-3757-8bfb-e88c3d187e02@linux.intel.com>
Date: Tue, 21 Oct 2025 16:18:54 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
cc: Brian Norris <briannorris@...omium.org>, Lukas Wunner <lukas@...ner.de>, 
    Bjorn Helgaas <bhelgaas@...gle.com>, LKML <linux-kernel@...r.kernel.org>, 
    linux-pm@...r.kernel.org, linux-pci@...r.kernel.org
Subject: Re: [PATCH] PCI/PM: Prevent runtime suspend before devices are fully
 initialized

On Tue, 21 Oct 2025, Rafael J. Wysocki wrote:
> On Tue, Oct 21, 2025 at 1:27 PM Ilpo Järvinen
> <ilpo.jarvinen@...ux.intel.com> wrote:
> > On Mon, 20 Oct 2025, Brian Norris wrote:
> > > On Mon, Oct 20, 2025 at 06:56:41PM +0300, Ilpo Järvinen wrote:
> > > > On Fri, 17 Oct 2025, Brian Norris wrote:
> > > >
> > > > > On Fri, Oct 17, 2025 at 02:49:35PM +0300, Ilpo Järvinen wrote:
> > > > > > On Fri, 17 Oct 2025, Lukas Wunner wrote:
> > > > > >
> > > > > > > [cc += Ilpo]
> > > > > > >
> > > > > > > On Thu, Oct 16, 2025 at 03:53:35PM -0700, Brian Norris wrote:
> > > > > > > > PCI devices are created via pci_scan_slot() and similar, and are
> > > > > > > > promptly configured for runtime PM (pci_pm_init()). They are initially
> > > > > > > > prevented from suspending by way of pm_runtime_forbid(); however, it's
> > > > > > > > expected that user space may override this via sysfs [1].
> > > > > >
> > > > > > Is this true as pm_runtime_forbid() also increases PM usage count?
> > > > >
> > > > > Yes it's true. See below.
> > > > >
> > > > > > "void pm_runtime_forbid(struct device *dev);
> > > > > >
> > > > > > unset the power.runtime_auto flag for the device and increase its
> > > > > > usage counter (used by the /sys/devices/.../power/control interface to
> > > > > > effectively prevent the device from being power managed at run time)"
> > >
> > > I see this doc line confused you, and I can sympathize.
> > >
> > > IIUC, the parenthetical means that sysfs *uses* pm_runtime_forbid() to
> > > "effectively prevent runtime power management"; pm_runtime_forbid() does
> > > not block user space from doing anything.
> > >
> > > > > Right, but sysfs `echo auto > .../power/control` performs the inverse --
> > > > > pm_runtime_allow() -- which decrements that count.
> > > >
> > > > Fair enough, I didn't check what it does.
> > > >
> > > > IMO, the details about how the usage count behaves should be part of the
> > > > changelog as that documentation I quoted sounded like user control is
> > > > prevented when forbidden.
> > >
> > > I tried to elaborate on the API doc confusion above. But frankly, I'm
> > > not sure how best to explain runtime PM.
> > >
> > > > I see you've put this part of the explanation
> > > > into the v2 as well so I suggest you explain the usage count in the change
> > > > so it is recorded in the commit if somebody has to look at this commit
> > > > years from now.
> > >
> > > Both v1 and v2 mention that the sysfs 'power/control' file can override
> > > the kernel calling pm_runtime_forbid(). They don't mention the usage
> > > count, since that's an implementation detail IMO. (To me, the mental
> > > model works best if "usage count" (usually get()/put()) is considered
> > > mostly orthogonal to forbid()/allow()/sysfs, because "forbid()" can be
> > > overridden at any time.)
> > >
> > > This is also covered here:
> > >
> > > https://docs.kernel.org/power/runtime_pm.html#runtime-pm-initialization-device-probing-and-removal
> > >
> > > "In principle, this mechanism may also be used by the driver to
> > > effectively turn off the runtime power management of the device until
> > > the user space turns it on."
> >
> > The problem is already rooted into the function name, when a function is
> > called "forbid", anyone unfamiliar will think it really forbids
> > something.
> 
> And it does, until the "allow" counterpart of it is called.
> 
> The confusing part here is that the "allow" counterpart is called from
> a sysfs attribute.

Yes it is but the fact that allow then reduces usage count too even more 
so. I understand it's necessary for allowing the functionality but I hope 
you can see how illogical it sounds that usage suddenly is less because of 
an user action through sysfs, it just defies normal reasoning (no offense 
meant in any way to anyone :-)).

> > The docs just further reinforced the idea and the fact that it
> > also increments usage count.
> >
> > It is quite unexpected and feels quite illogical (for non-PM person like
> > me) that user interface then goes to reverse that usage count increase,
> > what would be the logical reason why there now are less users for it when
> > user wants to turn on PM? (I understand things are done that way, no need
> > to explain that further, but there are quite a few misleading things in
> > this entire scenario, not just that parenthesis part of the docs.)
> 
> So the purpose of this "forbid" call in pci_pm_init() is to "block"
> runtime PM for PCI devices by default, but allow user space to
> "unblock" it later.
> 
> Would adding a comment to that effect next to that call be useful?

It would be useful to improve the wording in PM documentation which is too 
ambiguous. I suggest changing this:

"void pm_runtime_forbid(struct device *dev);

unset the power.runtime_auto flag for the device and increase its
usage counter (used by the /sys/devices/.../power/control interface to 
effectively prevent the device from being power managed at run time).

to:

"... (used to prevent the device from being power managed at run time 
until pm_runtime_allow() or /sys/devices/.../power/control interface 
allows it)."


I have to admit I'd still end up thinking usage count remains 1 up at this 
point (if not knowing better like I do now) but at least it would
unambiguously tells what the interface does.


My another point related to whether that confusing usage count decrement 
(solely because of user action) is worth describing in the changelog of 
this patch (or v2 of it which follows your suggestion on moving enable 
locatoon). IMO, it's odd/confusing enough a note would be warranted.
I know it after this discussion, obviously, but I worry over others who 
trip over the same thing if somebody else has to look over this change.


-- 
 i.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ