Message-ID: <CAPDyKFoPzFqbXo1nD_6r-mW+OYUHSdH+P19ov-0kjb8znLYtYw@mail.gmail.com>
Date: Fri, 31 Oct 2025 11:19:15 +0100
From: Ulf Hansson <ulf.hansson@...aro.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: linux-pm@...r.kernel.org, Vincent Guittot <vincent.guittot@...aro.org>,
Peter Zijlstra <peterz@...radead.org>, Kevin Hilman <khilman@...libre.com>,
Pavel Machek <pavel@...nel.org>, Len Brown <len.brown@...el.com>,
Daniel Lezcano <daniel.lezcano@...aro.org>, Saravana Kannan <saravanak@...gle.com>,
Maulik Shah <quic_mkshah@...cinc.com>, Prasad Sodagudi <psodagud@...cinc.com>,
Dhruva Gole <d-gole@...com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/4] pmdomain: Respect the CPU system-wakeup QoS limit
during s2idle
On Thu, 30 Oct 2025 at 19:11, Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Thu, Oct 30, 2025 at 4:07 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> >
> > On Thu, 30 Oct 2025 at 15:02, Rafael J. Wysocki <rafael@...nel.org> wrote:
> > >
> > > On Thu, Oct 30, 2025 at 1:32 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > > >
> > > > On Thu, 30 Oct 2025 at 13:23, Rafael J. Wysocki <rafael@...nel.org> wrote:
> > > > >
> > > > > On Thu, Oct 30, 2025 at 1:00 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > > > > >
> > > > > > On Thu, 30 Oct 2025 at 11:45, Rafael J. Wysocki <rafael@...nel.org> wrote:
> > > > > > >
> > > > > > > On Thu, Oct 16, 2025 at 5:19 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > > > > > > >
> > > > > > > > A CPU system-wakeup QoS limit may have been requested by user-space. To
> > > > > > > > avoid breaking this constraint when entering a low-power state during
> > > > > > > > s2idle through genpd, let's extend the corresponding genpd governor for
> > > > > > > > CPUs. More precisely, during s2idle let the genpd governor select a
> > > > > > > > suitable low-power state, by taking into account the QoS limit.
> > > > > > > >
> > > > > > > > Signed-off-by: Ulf Hansson <ulf.hansson@...aro.org>
> > > > > > > > ---
> > > > > > > >
> > > > > > > > Changes in v2:
> > > > > > > > - Limit the change to the genpd governor for CPUs.
> > > > > > > >
> > > > > > > > ---
> > > > > > > > drivers/pmdomain/core.c | 10 ++++++++--
> > > > > > > > drivers/pmdomain/governor.c | 27 +++++++++++++++++++++++++++
> > > > > > > > include/linux/pm_domain.h | 1 +
> > > > > > > > 3 files changed, 36 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
> > > > > > > > index 61c2277c9ce3..4fd546ef0448 100644
> > > > > > > > --- a/drivers/pmdomain/core.c
> > > > > > > > +++ b/drivers/pmdomain/core.c
> > > > > > > > @@ -1425,8 +1425,14 @@ static void genpd_sync_power_off(struct generic_pm_domain *genpd, bool use_lock,
> > > > > > > > return;
> > > > > > > > }
> > > > > > > >
> > > > > > > > - /* Choose the deepest state when suspending */
> > > > > > > > - genpd->state_idx = genpd->state_count - 1;
> > > > > > > > + if (genpd->gov && genpd->gov->system_power_down_ok) {
> > > > > > > > + if (!genpd->gov->system_power_down_ok(&genpd->domain))
> > > > > > > > + return;
> > > > > > > > + } else {
> > > > > > > > + /* Default to the deepest state. */
> > > > > > > > + genpd->state_idx = genpd->state_count - 1;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > if (_genpd_power_off(genpd, false)) {
> > > > > > > > genpd->states[genpd->state_idx].rejected++;
> > > > > > > > return;
> > > > > > > > diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
> > > > > > > > index 39359811a930..bd1b9d66d4a5 100644
> > > > > > > > --- a/drivers/pmdomain/governor.c
> > > > > > > > +++ b/drivers/pmdomain/governor.c
> > > > > > > > @@ -415,9 +415,36 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
> > > > > > > > return false;
> > > > > > > > }
> > > > > > > >
> > > > > > > > +static bool cpu_system_power_down_ok(struct dev_pm_domain *pd)
> > > > > > > > +{
> > > > > > > > + s64 constraint_ns = cpu_wakeup_latency_qos_limit() * NSEC_PER_USEC;
> > > > > > >
> > > > > > > I'm not sure why genpd needs to take cpu_wakeup_latency_qos_limit()
> > > > > > > into account directly.
> > > > > > >
> > > > > > > It should be told by cpuidle which state has been selected on the CPU
> > > > > > > side and it should not go any deeper than that anyway.
> > > > > >
> > > > > > For PSCI OS-initiated mode, cpuidle doesn't know about the states that
> > > > > > may be shared among a group of CPUs.
> > > > > >
> > > > > > Instead, those states are controlled through the PM domain topology by
> > > > > > genpd and its governor, hence this is needed too.
> > > > >
> > > > > All right, but I'd like to understand how all of that works.
> > > > >
> > > > > So cpuidle selects a state to enter for the given CPU and then genpd
> > > > > is invoked. It has to take the exit latency of that state into
> > > > > account, so it doesn't go too deep. How does it do that?
> > > >
> > > > Depending on the state selected, in cpuidle-psci.c we may end up
> > > > calling __psci_enter_domain_idle_state() (only for the deepest
> > > > CPU-state).
> > > >
> > > > For s2idle this means we call dev_pm_genpd_suspend|resume(), to manage
> > > > the reference counting of the PM domains via genpd. This may then lead
> > > > to genpd_sync_power_off() trying to select a state by calling the new
> > > > governor function above.
> > > >
> > > > Did that make sense?
> > >
> > > So IIUC this will only happen if the deepest idle state is selected in
> > > which case the cpu_wakeup_latency_qos_limit() value is greater than
> > > the exit latency of that state, but it may still need to be taken into
> > > account when selecting the domain state. However, this means that the
> >
> > Correct.
> >
> > > exit latency number for the deepest idle state is too low (it should
> > > represent the worst-case exit latency which means the maximum domain
> > > exit latency in this particular case).
> >
> > Yes, from the cpuidle state-selection point of view, but how is that a problem?
>
> It is confusing. Otherwise, for s2idle, I guess it is not a big deal.
>
> I guess what happens is that genpd has a range of states with
> different latency values to choose from and it is not practical to
> expose all of them as CPU idle states, so you end up exposing just one
> of them with the lowest latency value to allow cpuidle to involve
> genpd often enough.
Yes, the states that are CPU-specific are exposed to cpuidle.
The states that are shared among CPUs are managed by genpd,
because those need reference counting. They are not even limited to CPUs.
>
> If that's the case, I'd make a note of that somewhere if I were you,
> or people will routinely get confused by it.
Documentation is always nice.
We have DT docs and the PSCI spec, but we lack proper documentation of
the whole genpd interface. Actually, I have started working on
documentation for genpd, but haven't reached the point of submitting a
patch for it.
>
> > If the genpd-governor doesn't find a suitable "domain-idle-state", we
> > fallback to using the one cpuidle selected.
> >
> > >
> > > Moreover, it looks like the "runtime" cpuidle has the same problem, doesn't it?
> >
> > It works in a very similar way, but I fail to understand why you think
> > there is a problem.
>
> There is a problem because it may violate a "runtime" latency constraint.
>
> Say you expose 2 CPU idle states, a shallow one and a genpd one. The
> advertised exit latency of the genpd state is X and the current
> latency constraint is Y > X. The genpd state is selected and genpd
> doesn't look at the cpuidle_governor_latency_req() return value, so it
> chooses a real state with exit latency Z > Y.
>
> To a minimum, genpd should be made aware of
> cpuidle_governor_latency_req(), but even then cpuidle governors take
> exit latency into consideration in their computations, so things may
> get confused somewhat.
Please have a look at cpu_power_down_ok(), which is the function that
runs to select the domain-idle-state. It does take the constraints
into account at runtime, even if it doesn't call
cpuidle_governor_latency_req() explicitly.
Kind regards
Uffe