Message-ID: <20251031110106.62394-3-ulf.hansson@linaro.org>
Date: Fri, 31 Oct 2025 12:00:58 +0100
From: Ulf Hansson <ulf.hansson@...aro.org>
To: "Rafael J . Wysocki" <rafael@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Mark Rutland <mark.rutland@....com>,
Marc Zyngier <maz@...nel.org>,
Maulik Shah <quic_mkshah@...cinc.com>,
Sudeep Holla <sudeep.holla@....com>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Ben Horgan <ben.horgan@....com>,
linux-pm@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org,
Ulf Hansson <ulf.hansson@...aro.org>
Subject: [PATCH v3 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs
When the genpd governor for CPUs tries to select the optimal idle state
for a group of CPUs managed in a PM domain, it fails far too often. On a
Dragonboard 410c, an arm64-based platform with 4 CPUs in one cluster that
uses PSCI OS-initiated mode, we can observe that entering the selected
idle state often fails. This is suboptimal behaviour that leads to many
unnecessary requests being sent to the PSCI FW.
A simple dd operation that reads from the eMMC, generating some IRQs and
I/O handling, helps us understand the problem when we also monitor the
rejected counters in debugfs for the corresponding idle states of the genpd
in question.
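The per-run reject deltas quoted in this message are the change in the sum of the Rejected column between two idle_states dumps. The following is a minimal userspace sketch of that bookkeeping, not part of the patch. The function name sum_rejected() is hypothetical, and the parser assumes the column layout shown in the dumps in this message (one header line, then one row per state).

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum the "Rejected" column of a genpd idle_states dump held in memory.
 *
 * Assumed row layout (taken from the dumps in this message):
 *   S<n> <time_ms> <usage> <rejected> <above> <below>
 * Lines that do not match, such as the header, are skipped.
 * Returns the sum, or -1 if no row could be parsed.
 */
long sum_rejected(const char *text)
{
	long total = 0;
	int rows = 0;
	const char *line = text;

	while (line && *line) {
		unsigned int idx;
		unsigned long long time_ms, usage, rejected, above, below;

		if (sscanf(line, " S%u %llu %llu %llu %llu %llu",
			   &idx, &time_ms, &usage, &rejected,
			   &above, &below) == 6) {
			total += (long)rejected;
			rows++;
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return rows ? total : -1;
}
```

Calling sum_rejected() on a dump taken before and after the dd run and subtracting the two results gives the reject delta for that run.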
- Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     1451            437    91        149    0
S1     65194           558    149       172    0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.562698 seconds, 140.3MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     2694            1073   265       892    1
S1     74567           829    561       790    0
The dd completed in ~3.6 seconds and rejects increased by 586
(S0: 265 - 91 = 174, S1: 561 - 149 = 412).
- Teo governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     4976            2096   392       1721   2
S1     160661          1893   1309      1904   0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.543225 seconds, 141.1MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     5192            2194   433       1830   2
S1     167677          2891   3184      4729   0
The dd completed in ~3.5 seconds and rejects increased by 1916
(S0: 433 - 392 = 41, S1: 3184 - 1309 = 1875).
The main reason for the above problem is pending IPIs for one of the CPUs
affected by the idle state that the genpd governor selected. This causes
the PSCI FW to refuse to enter it. To improve the behaviour, let's start
taking pending IPIs for CPUs into account in the genpd governor, so that
we fall back to the shallower per-CPU idle state.
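The resulting decision order can be modelled in plain userspace C. This is an illustrative sketch only, not the kernel code: struct model_state and model_select_state() are hypothetical names, the fields mirror the ones used in the diff, and the ipi_pending flag stands in for the result of the pending-IPI check on the domain's CPUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one genpd idle state. */
struct model_state {
	int64_t residency_ns;
	int64_t power_on_latency_ns;
	int64_t power_off_latency_ns;
};

/*
 * Walk from the deepest state towards the shallowest and stop at the first
 * one whose residency and latency requirements are met. Only then consult
 * the pending-IPI flag: if an IPI is pending, refuse the domain state so
 * that the caller stays in the shallower per-CPU idle state.
 *
 * Returns the selected state index, or -1 if the domain must stay on.
 */
int model_select_state(const struct model_state *states, int nr_states,
		       int64_t idle_duration_ns, int64_t global_constraint_ns,
		       bool ipi_pending)
{
	int i;

	for (i = nr_states - 1; i >= 0; i--) {
		if (idle_duration_ns >= states[i].residency_ns +
					states[i].power_off_latency_ns &&
		    global_constraint_ns >= states[i].power_on_latency_ns +
					    states[i].power_off_latency_ns)
			break;
	}

	/* No state fits the expected idle duration and QoS constraint. */
	if (i < 0)
		return -1;

	/* A pending IPI would make the firmware reject the request. */
	if (ipi_pending)
		return -1;

	return i;
}
```

Checking the flag after the loop, rather than inside it, means the check runs at most once per call and only when a domain state would otherwise have been selected.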
Re-testing with this change shows significantly improved behaviour.
- Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     2556            878    19        368    1
S1     69974           596    10        152    0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.522010 seconds, 142.0MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     3360            1320   28        819    1
S1     70168           710    11        267    0
The dd completed in ~3.5 seconds and rejects increased by 10
(S0: 28 - 19 = 9, S1: 11 - 10 = 1).
- Teo governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     5145            1861   39        938    1
S1     188887          3117   51        1975   0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.653100 seconds, 136.9MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State  Time Spent(ms)  Usage  Rejected  Above  Below
S0     5260            1923   42        1002   1
S1     190849          4033   52        2892   0
The dd completed in ~3.7 seconds and rejects increased by 4
(S0: 42 - 39 = 3, S1: 52 - 51 = 1).
Note that the rejected counters in genpd are also accumulated in the
rejected counters managed by cpuidle, although on a per-CPU idle state
basis. Comparing these counters before and after this change, through
cpuidle's sysfs interface, shows similar improvements.
Signed-off-by: Ulf Hansson <ulf.hansson@...aro.org>
---
Changes in v3:
- Use the new name of the helper function.
- Minor updates to the commit message.
Changes in v2:
- Use the new name of the helper function.
- Re-test and update the statistics in the commit message.
---
drivers/pmdomain/governor.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
index 39359811a930..a46470f2261a 100644
--- a/drivers/pmdomain/governor.c
+++ b/drivers/pmdomain/governor.c
@@ -404,15 +404,21 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
 		if ((idle_duration_ns >= (genpd->states[i].residency_ns +
 		    genpd->states[i].power_off_latency_ns)) &&
 		    (global_constraint >= (genpd->states[i].power_on_latency_ns +
-		    genpd->states[i].power_off_latency_ns))) {
-			genpd->state_idx = i;
-			genpd->gd->last_enter = now;
-			genpd->gd->reflect_residency = true;
-			return true;
-		}
+		    genpd->states[i].power_off_latency_ns)))
+			break;
+
 	} while (--i >= 0);
 
-	return false;
+	if (i < 0)
+		return false;
+
+	if (cpus_peek_for_pending_ipi(genpd->cpus))
+		return false;
+
+	genpd->state_idx = i;
+	genpd->gd->last_enter = now;
+	genpd->gd->reflect_residency = true;
+	return true;
 }
 
 struct dev_power_governor pm_domain_cpu_gov = {
--
2.43.0