linux-kernel - Re: [PATCH 2/2] pmdommain: add support system-wide resume latency constraints

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAPDyKFoKhnt-vz-vYO3Tnr93pe3b_2md9HL5XKo8MOX9_tkqnQ@mail.gmail.com>
Date: Wed, 4 Feb 2026 14:33:02 +0100
From: Ulf Hansson <ulf.hansson@...aro.org>
To: Kevin Hilman <khilman@...libre.com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org, 
	Dhruva Gole <d-gole@...com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] pmdommain: add support system-wide resume latency constraints

On Wed, 4 Feb 2026 at 00:19, Kevin Hilman <khilman@...libre.com> wrote:
>
> Ulf Hansson <ulf.hansson@...aro.org> writes:
>
> > On Wed, 21 Jan 2026 at 02:54, Kevin Hilman (TI) <khilman@...libre.com> wrote:
> >>
> >> In addition to checking for CPU latency constraints when checking if
> >> OK to power down a domain, also check for QoS latency constraints in
> >> all devices of a domain and use that in determining the final latency
> >> constraint to use for the domain.
> >>
> >> Since cpu_system_power_down_ok() is used for system-wide suspend, the
> >> per-device constratints are only relevant if the LATENCY_SYS QoS flag
> >> is set.
> >>
> >> Because this flag implies the latency constraint only applies to
> >> system-wide suspend, also check the flag in
> >> dev_update_qos_constraint(). If it is set, then the constraint is not
> >> relevant for runtime PM decisions.
> >>
> >> Signed-off-by: Kevin Hilman (TI) <khilman@...libre.com>
> >> ---
> >>  drivers/pmdomain/governor.c | 42 +++++++++++++++++++++++++++++++++++++++++-
> >>  1 file changed, 41 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
> >> index 96737abbb496..03802a859a78 100644
> >> --- a/drivers/pmdomain/governor.c
> >> +++ b/drivers/pmdomain/governor.c
> >> @@ -31,6 +31,8 @@ static int dev_update_qos_constraint(struct device *dev, void *data)
> >>                 constraint_ns = td ? td->effective_constraint_ns :
> >>                                 PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS;
> >>         } else {
> >> +               enum pm_qos_flags_status flag_status;
> >> +
> >>                 /*
> >>                  * The child is not in a domain and there's no info on its
> >>                  * suspend/resume latencies, so assume them to be negligible and
> >> @@ -38,7 +40,14 @@ static int dev_update_qos_constraint(struct device *dev, void *data)
> >>                  * known at this point anyway).
> >>                  */
> >>                 constraint_ns = dev_pm_qos_read_value(dev, DEV_PM_QOS_RESUME_LATENCY);
> >> -               constraint_ns *= NSEC_PER_USEC;
> >> +               flag_status = dev_pm_qos_flags(dev, PM_QOS_FLAG_LATENCY_SYS);
> >> +               if ((constraint_ns != PM_QOS_RESUME_LATENCY_NO_CONSTRAINT) &&
> >> +                           (flag_status == PM_QOS_FLAGS_ALL)) {
> >> +                       dev_dbg_once(dev, "resume-latency only for system-wide.  Ignoring.\n");
> >> +                       constraint_ns = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS;
> >> +               } else {
> >> +                       constraint_ns *= NSEC_PER_USEC;
> >> +               }
> >>         }
> >
> > dev_update_qos_constraint() is called only to take into account the
> > QoS constraints for the device's *children*.
> >
> > It looks like we should also be checking the PM_QOS_FLAG_LATENCY_SYS
> > flag in default_suspend_ok() for the device in question.
> >
> > That said, there seems to be more places in the kernel where we should
> > check the PM_QOS_FLAG_LATENCY_SYS flag, like in cpu_power_down_ok(),
> > cpuidle_governor_latency_req(), etc.
>
> OK.  But now that we've agreed to drop the userspace interface for this,
> I wonder if the better approach is now to consider the flag to mean that
> the latency applies to runtime PM *and* system-wide PM.   Then, without
> the flag set, the latency applies *only* to runtime PM.
>
> That approach would allow the current default behavior to stay the same,
> and not require adding checks for this flag throughout the runtime code,
> and only require checking for the flag in the system-wide PM paths.

I agree with all of the above!

It would certainly make this less intrusive and it would also be more
consistent with what we did for CPU QoS.

>
> >>         if (constraint_ns < *constraint_ns_p)
> >> @@ -430,12 +439,43 @@ static bool cpu_system_power_down_ok(struct dev_pm_domain *pd)
> >>         s64 constraint_ns = cpu_wakeup_latency_qos_limit() * NSEC_PER_USEC;
> >>         struct generic_pm_domain *genpd = pd_to_genpd(pd);
> >>         int state_idx = genpd->state_count - 1;
> >> +       struct pm_domain_data *pdd;
> >> +       s32 min_dev_latency = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT;
> >> +       s64 min_dev_latency_ns = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS;
> >> +       struct gpd_link *link;
> >>
> >>         if (!(genpd->flags & GENPD_FLAG_CPU_DOMAIN)) {
> >>                 genpd->state_idx = state_idx;
> >>                 return true;
> >>         }
> >>
> >> +       list_for_each_entry(link, &genpd->parent_links, parent_node) {
> >> +               struct generic_pm_domain *child_pd = link->child;
> >> +
> >> +               list_for_each_entry(pdd, &child_pd->dev_list, list_node) {
> >> +                       enum pm_qos_flags_status flag_status;
> >> +                       s32 dev_latency;
> >> +
> >> +                       dev_latency = dev_pm_qos_read_value(pdd->dev, DEV_PM_QOS_RESUME_LATENCY);
> >> +                       flag_status = dev_pm_qos_flags(pdd->dev, PM_QOS_FLAG_LATENCY_SYS);
> >> +                       if ((dev_latency != PM_QOS_RESUME_LATENCY_NO_CONSTRAINT) &&
> >> +                           (flag_status == PM_QOS_FLAGS_ALL)) {
> >> +                               dev_dbg(pdd->dev,
> >> +                                       "in domain %s, has QoS system-wide resume latency=%d\n",
> >> +                                       child_pd->name, dev_latency);
> >> +                               if (dev_latency < min_dev_latency)
> >> +                                       min_dev_latency = dev_latency;
> >> +                       }
> >> +               }
> >
> > cpu_system_power_down_ok() is at the moment only used for CPU PM
> > domains. If the intent is to take into account QoS constraints for
> > CPUs, we should check the QoS value for CPU-devices as well (by using
> > get_cpu_device(), see cpu_power_down_ok(). For non-CPU devices
> > something along the lines of the above makes sense to me.
> >
> > Although, please note, the above code is just walking through the
> > devices in the child-domains, there is nothing checking the devices
> > that belong to the current/parent-domain.
>
> Oops, yeah.  Good catch.
>
> > Nor are we taking child-devices into account.
>
> Indeed... double oops.
>
> This makes me wonder if we have any helpers to iterate over every device
> (and children) in a domain (and subdomains.)

Unfortunately there isn't, but it's a good idea I think.

If you decide to add helpers for this, please define them in a new
header-file internally for genpd, in drivers/pmdomain/core.h, so they
don't get publicly available via include/linux/pm_domain.h.

Kind regards
Uffe