[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250317060357.29451-1-aboorvad@linux.ibm.com>
Date: Mon, 17 Mar 2025 11:33:57 +0530
From: Aboorva Devarajan <aboorvad@...ux.ibm.com>
To: "Rafael J . Wysocki" <rafael@...nel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Christian Loehle <christian.loehle@....com>,
Gautam Menghani <gautam@...ux.ibm.com>,
Aboorva Devarajan <aboorvad@...ux.ibm.com>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [PATCH v2 1/1] cpuidle: menu: Prefer polling state for short idle durations
Avoid selecting deep idle state when the predicted idle duration is
shorter than its target residency, as this leads to unnecessary state
transitions without energy savings.
On virtualized PowerPC (pseries) systems, where only one polling state
(Snooze) and one deep state (CEDE) are available, selecting CEDE when
its target residency exceeds the predicted idle duration hurts
performance.
For example, if the predicted idle duration is 15 us and the first
non-polling state has a target residency of 120 us, selecting it
would be suboptimal.
Remove the condition introduced in commit 69d25870f20c
("cpuidle: fix the menu governor to boost IO performance") that
prioritized non-polling states even when their target residency
exceeded the predicted idle duration and allow polling states to
be selected when appropriate.
Performance improvement observed with pgbench on PowerPC (pseries)
system:
+---------------------------+------------+------------+------------+
| Metric | Baseline | Patched | Change (%) |
+---------------------------+------------+------------+------------+
| Transactions/sec (TPS) | 494,834 | 538,707 | +8.85% |
| Avg latency (ms) | 0.162 | 0.149 | -8.02% |
+---------------------------+------------+------------+------------+
CPUIdle state usage:
+--------------+--------------+-------------+
| Metric | Baseline | Patched |
+--------------+--------------+-------------+
| Total usage | 12,703,630 | 13,941,966 |
| Above usage | 11,388,990 | 1,620,474 |
| Below usage | 19,973 | 684,708 |
+--------------+--------------+-------------+
Above/Total and Below/Total usage percentages:
+------------------------+-----------+---------+
| Metric | Baseline | Patched |
+------------------------+-----------+---------+
| Above % (Above/Total) | 89.67% | 11.63% |
| Below % (Below/Total) | 0.16% | 4.91% |
| Total cpuidle miss (%) | 89.83% | 16.54% |
+------------------------+-----------+---------+
Signed-off-by: Aboorva Devarajan <aboorvad@...ux.ibm.com>
---
v1: https://lore.kernel.org/all/20240809073120.250974-1-aboorvad@linux.ibm.com/
v1 -> v2:
- Drop cover letter and improve commit message.
---
drivers/cpuidle/governors/menu.c | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 28363bfa3e4c..4b199377e4be 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -296,17 +296,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
idx = i; /* first enabled state */
if (s->target_residency_ns > predicted_ns) {
- /*
- * Use a physical idle state, not busy polling, unless
- * a timer is going to trigger soon enough.
- */
- if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
- s->exit_latency_ns <= latency_req &&
- s->target_residency_ns <= data->next_timer_ns) {
- predicted_ns = s->target_residency_ns;
- idx = i;
- break;
- }
if (predicted_ns < TICK_NSEC)
break;
--
2.43.5
Powered by blists - more mailing lists