linux-kernel - Re: [PATCH 2/2] pmdommain: add support system-wide resume latency constraints

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7hbji5za76.fsf@baylibre.com>
Date: Tue, 03 Feb 2026 15:19:57 -0800
From: Kevin Hilman <khilman@...libre.com>
To: Ulf Hansson <ulf.hansson@...aro.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org,
 Dhruva Gole <d-gole@...com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] pmdommain: add support system-wide resume latency
 constraints

Ulf Hansson <ulf.hansson@...aro.org> writes:

> On Wed, 21 Jan 2026 at 02:54, Kevin Hilman (TI) <khilman@...libre.com> wrote:
>>
>> In addition to checking for CPU latency constraints when checking if
>> OK to power down a domain, also check for QoS latency constraints in
>> all devices of a domain and use that in determining the final latency
>> constraint to use for the domain.
>>
>> Since cpu_system_power_down_ok() is used for system-wide suspend, the
>> per-device constratints are only relevant if the LATENCY_SYS QoS flag
>> is set.
>>
>> Because this flag implies the latency constraint only applies to
>> system-wide suspend, also check the flag in
>> dev_update_qos_constraint(). If it is set, then the constraint is not
>> relevant for runtime PM decisions.
>>
>> Signed-off-by: Kevin Hilman (TI) <khilman@...libre.com>
>> ---
>>  drivers/pmdomain/governor.c | 42 +++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 41 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
>> index 96737abbb496..03802a859a78 100644
>> --- a/drivers/pmdomain/governor.c
>> +++ b/drivers/pmdomain/governor.c
>> @@ -31,6 +31,8 @@ static int dev_update_qos_constraint(struct device *dev, void *data)
>>                 constraint_ns = td ? td->effective_constraint_ns :
>>                                 PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS;
>>         } else {
>> +               enum pm_qos_flags_status flag_status;
>> +
>>                 /*
>>                  * The child is not in a domain and there's no info on its
>>                  * suspend/resume latencies, so assume them to be negligible and
>> @@ -38,7 +40,14 @@ static int dev_update_qos_constraint(struct device *dev, void *data)
>>                  * known at this point anyway).
>>                  */
>>                 constraint_ns = dev_pm_qos_read_value(dev, DEV_PM_QOS_RESUME_LATENCY);
>> -               constraint_ns *= NSEC_PER_USEC;
>> +               flag_status = dev_pm_qos_flags(dev, PM_QOS_FLAG_LATENCY_SYS);
>> +               if ((constraint_ns != PM_QOS_RESUME_LATENCY_NO_CONSTRAINT) &&
>> +                           (flag_status == PM_QOS_FLAGS_ALL)) {
>> +                       dev_dbg_once(dev, "resume-latency only for system-wide.  Ignoring.\n");
>> +                       constraint_ns = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS;
>> +               } else {
>> +                       constraint_ns *= NSEC_PER_USEC;
>> +               }
>>         }
>
> dev_update_qos_constraint() is called only to take into account the
> QoS constraints for the device's *children*.
>
> It looks like we should also be checking the PM_QOS_FLAG_LATENCY_SYS
> flag in default_suspend_ok() for the device in question.
>
> That said, there seems to be more places in the kernel where we should
> check the PM_QOS_FLAG_LATENCY_SYS flag, like in cpu_power_down_ok(),
> cpuidle_governor_latency_req(), etc.

OK.  But now that we've agreed to drop the userspace interface for this,
I wonder if the better approach is now to consider the flag to mean that
the latency applies to runtime PM *and* system-wide PM.   Then, without
the flag set, the latency applies *only* to runtime PM.

That approach would allow the current default behavior to stay the same,
and not require adding checks for this flag throughout the runtime code,
and only require checking for the flag in the system-wide PM paths.

>>         if (constraint_ns < *constraint_ns_p)
>> @@ -430,12 +439,43 @@ static bool cpu_system_power_down_ok(struct dev_pm_domain *pd)
>>         s64 constraint_ns = cpu_wakeup_latency_qos_limit() * NSEC_PER_USEC;
>>         struct generic_pm_domain *genpd = pd_to_genpd(pd);
>>         int state_idx = genpd->state_count - 1;
>> +       struct pm_domain_data *pdd;
>> +       s32 min_dev_latency = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT;
>> +       s64 min_dev_latency_ns = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS;
>> +       struct gpd_link *link;
>>
>>         if (!(genpd->flags & GENPD_FLAG_CPU_DOMAIN)) {
>>                 genpd->state_idx = state_idx;
>>                 return true;
>>         }
>>
>> +       list_for_each_entry(link, &genpd->parent_links, parent_node) {
>> +               struct generic_pm_domain *child_pd = link->child;
>> +
>> +               list_for_each_entry(pdd, &child_pd->dev_list, list_node) {
>> +                       enum pm_qos_flags_status flag_status;
>> +                       s32 dev_latency;
>> +
>> +                       dev_latency = dev_pm_qos_read_value(pdd->dev, DEV_PM_QOS_RESUME_LATENCY);
>> +                       flag_status = dev_pm_qos_flags(pdd->dev, PM_QOS_FLAG_LATENCY_SYS);
>> +                       if ((dev_latency != PM_QOS_RESUME_LATENCY_NO_CONSTRAINT) &&
>> +                           (flag_status == PM_QOS_FLAGS_ALL)) {
>> +                               dev_dbg(pdd->dev,
>> +                                       "in domain %s, has QoS system-wide resume latency=%d\n",
>> +                                       child_pd->name, dev_latency);
>> +                               if (dev_latency < min_dev_latency)
>> +                                       min_dev_latency = dev_latency;
>> +                       }
>> +               }
>
> cpu_system_power_down_ok() is at the moment only used for CPU PM
> domains. If the intent is to take into account QoS constraints for
> CPUs, we should check the QoS value for CPU-devices as well (by using
> get_cpu_device(), see cpu_power_down_ok(). For non-CPU devices
> something along the lines of the above makes sense to me.
>
> Although, please note, the above code is just walking through the
> devices in the child-domains, there is nothing checking the devices
> that belong to the current/parent-domain.

Oops, yeah.  Good catch.

> Nor are we taking child-devices into account.

Indeed... double oops.  

This makes me wonder if we have any helpers to iterate over every device
(and children) in a domain (and subdomains.)

Kevin