[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7aceeac4-1cbe-590b-c1b7-5c62375cadd6@linux.intel.com>
Date: Wed, 22 Oct 2025 16:38:50 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: Brian Norris <briannorris@...omium.org>
cc: Lukas Wunner <lukas@...ner.de>, Bjorn Helgaas <bhelgaas@...gle.com>,
LKML <linux-kernel@...r.kernel.org>, linux-pm@...r.kernel.org,
"Rafael J . Wysocki" <rafael@...nel.org>, linux-pci@...r.kernel.org
Subject: Re: [PATCH] PCI/PM: Prevent runtime suspend before devices are fully
initialized
On Tue, 21 Oct 2025, Brian Norris wrote:
> On Tue, Oct 21, 2025 at 02:27:09PM +0300, Ilpo Järvinen wrote:
> > On Mon, 20 Oct 2025, Brian Norris wrote:
> > > On Mon, Oct 20, 2025 at 06:56:41PM +0300, Ilpo Järvinen wrote:
> > > > On Fri, 17 Oct 2025, Brian Norris wrote:
> > > > > On Fri, Oct 17, 2025 at 02:49:35PM +0300, Ilpo Järvinen wrote:
> > > > > > On Fri, 17 Oct 2025, Lukas Wunner wrote:
> > > > > > > On Thu, Oct 16, 2025 at 03:53:35PM -0700, Brian Norris wrote:
> > > > > > > > PCI devices are created via pci_scan_slot() and similar, and are
> > > > > > > > promptly configured for runtime PM (pci_pm_init()). They are initially
> > > > > > > > prevented from suspending by way of pm_runtime_forbid(); however, it's
> > > > > > > > expected that user space may override this via sysfs [1].
> > > > > >
> > > > > > Is this true as pm_runtime_forbid() also increases PM usage count?
> > > > >
> > > > > Yes it's true. See below.
> > > > >
> > > > > > "void pm_runtime_forbid(struct device *dev);
> > > > > >
> > > > > > unset the power.runtime_auto flag for the device and increase its
> > > > > > usage counter (used by the /sys/devices/.../power/control interface to
> > > > > > effectively prevent the device from being power managed at run time)"
> > >
> > > I see this doc line confused you, and I can sympathize.
> > >
> > > IIUC, the parenthetical means that sysfs *uses* pm_runtime_forbid() to
> > > "effectively prevent runtime power management"; pm_runtime_forbid() does
> > > not block user space from doing anything.
> > >
> > > > > Right, but sysfs `echo auto > .../power/control` performs the inverse --
> > > > > pm_runtime_allow() -- which decrements that count.
> > > >
> > > > Fair enough, I didn't check what it does.
> > > >
> > > > IMO, the details about how the usage count behaves should be part of the
> > > > changelog as that documentation I quoted sounded like user control is
> > > > prevented when forbidden.
> > >
> > > I tried to elaborate on the API doc confusion above. But frankly, I'm
> > > not sure how best to explain runtime PM.
> > >
> > > > I see you've put this part of the explanation
> > > > into the v2 as well so I suggest you explain the usage count in the change
> > > > so it is recorded in the commit if somebody has to look at this commit
> > > > years from now.
> > >
> > > Both v1 and v2 mention that the sysfs 'power/control' file can override
> > > the kernel calling pm_runtime_forbid(). They don't mention the usage
> > > count, since that's an implementation detail IMO. (To me, the mental
> > > model works best if "usage count" (usually get()/put()) is considered
> > > mostly orthogonal to forbid()/allow()/sysfs, because "forbid()" can be
> > > overridden at any time.)
> > >
> > > This is also covered here:
> > >
> > > https://docs.kernel.org/power/runtime_pm.html#runtime-pm-initialization-device-probing-and-removal
> > >
> > > "In principle, this mechanism may also be used by the driver to
> > > effectively turn off the runtime power management of the device until
> > > the user space turns it on."
> >
> > The problem is already rooted into the function name, when a function is
> > called "forbid", anyone unfamiliar will think it really forbids
> > something. The docs just further reinforced the idea and the fact that it
> > also increments usage count.
> >
> > It is quite unexpected and feels quite illogical (for non-PM person like
> > me) that user interface then goes to reverse that usage count increase,
> > what would be the logical reason why there now are less users for it when
> > user wants to turn on PM? (I understand things are done that way, no need
> > to explain that further, but there are quite a few misleading things in
> > this entire scenario, not just that parenthesis part of the docs.)
>
> Ack. I'm definitely not defending the API names or the docs. I'm
> frustrated by the same, by how long it took me to figure out what
> everything really means, and by how the same footguns hit everyone I
> work with, and we have to write these essays to explain it every time.
>
> > > But admittedly, I find the runtime PM API surface to be enormous, and
> > > its documentation ... not very helpful to outsiders. It's pretty much
> > > written by and for PM experts. Case in point: you read and quoted the
> > > appropriate docs, but it still misled you quite a bit :(
> > >
> > > I'm more tempted to try to improve the runtime PM docs than to try to
> > > make up for them in the commit message, but maybe I can be convinced
> > > otherwise.
> >
> > My personal approach is that if something comes up during review, it
> > likely is something also the next person to look at this change (from
> > history, maybe years from now) could similarly stumbles on when trying to
> > understand the change. Thus, it often is worth to mention such things in
> > the changelog too.
>
> Yes, I like that approach too. Unfortunately in this case, it feels like
> most of the confusion stems from the existing API and docs, and not from
> anything in the description. I never even mentioned the usage count,
> because frankly, I don't think it's relevant to the forbid()/allow()
> topic. In fact, the whole bug is because forbid() does *not* guarantee
> anything from the kernel's point of view -- it doesn't really hold a
> "usage count" for functional correctness in the way get()/put() do.
>
> Regardless, I can try to insert some language. Here's an attempt at yet
> another footnote for a v3:
>
> """
> [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(),
> /sys/.../power/control, and the runtime PM usage counter can be subtle.
> It appears that the intention of pm_runtime_forbid() /
> pm_runtime_allow() is twofold:
>
> 1. Allow the user to disable runtime_pm (force device to always be
> powered on) through sysfs.
> 2. Allow the driver to start with runtime_pm disabled (device forced
> on) and user space could later enable runtime_pm.
>
> This conclusion comes from reading `Documentation/power/runtime_pm.rst`,
> specifically the section starting "The user space can effectively
> disallow".
>
> This means that while pm_runtime_forbid() does technically increase the
> runtime PM usage counter, this usage counter is not a guarantee of
> functional correctness from the kernel's point of view, because user
> space can decrease that count again via sysfs.
> """
>
> Let me know what you think.
I'm fine with that, it explains well where the problem stems from.
Thanks.
--
i.
> Or feel free to tweak my v2 message and send
> a rewrite back to me (seriously! I'd like to see an outsider's view on
> what would explain it best).
>
> (And if that footnote is useful: I think part of that could belong in
> the actual docs.)
>
> > While I'm definitely not against improvements to docs too, the changelog
> > for any patch should help to understand why the change was made. And
> > IMO, this unexpected "internal detail" related to usage count which is
> > quite significant here, if user interface wouldn't lower it, runtime PM
> > would remain forbidden as forbid() promised.
>
> I think improving the docs would reap rewards. Like I said, this is by
> far not the first time that developers have had significant problems due
> to API confusion.
>
> Brian
>
Powered by blists - more mailing lists