[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <525a2352-dc57-45ca-adb2-f7039c37145e@arm.com>
Date: Mon, 17 Mar 2025 08:35:34 +0000
From: Christian Loehle <christian.loehle@....com>
To: Aboorva Devarajan <aboorvad@...ux.ibm.com>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Gautam Menghani <gautam@...ux.ibm.com>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] cpuidle: menu: Prefer polling state for short idle
durations
On 3/17/25 06:03, Aboorva Devarajan wrote:
> Avoid selecting deep idle state when the predicted idle duration is
> shorter than its target residency, as this leads to unnecessary state
> transitions without energy savings.
>
> On virtualized PowerPC (pseries) systems, where only one polling state
> (Snooze) and one deep state (CEDE) are available, selecting CEDE when
> its target residency exceeds the predicted idle duration hurts
> performance.
>
> For example, if the predicted idle duration is 15 us and the first
> non-polling state has a target residency of 120 us, selecting it
> would be suboptimal.
>
> Remove the condition introduced in commit 69d25870f20c
> ("cpuidle: fix the menu governor to boost IO performance") that
> prioritized non-polling states even when their target residency
> exceeded the predicted idle duration and allow polling states to
> be selected when appropriate.
>
> Performance improvement observed with pgbench on PowerPC (pseries)
> system:
> +---------------------------+------------+------------+------------+
> | Metric | Baseline | Patched | Change (%) |
> +---------------------------+------------+------------+------------+
> | Transactions/sec (TPS) | 494,834 | 538,707 | +8.85% |
> | Avg latency (ms) | 0.162 | 0.149 | -8.02% |
> +---------------------------+------------+------------+------------+
>
> CPUIdle state usage:
> +--------------+--------------+-------------+
> | Metric | Baseline | Patched |
> +--------------+--------------+-------------+
> | Total usage | 12,703,630 | 13,941,966 |
> | Above usage | 11,388,990 | 1,620,474 |
> | Below usage | 19,973 | 684,708 |
> +--------------+--------------+-------------+
>
> Above/Total and Below/Total usage percentages:
> +------------------------+-----------+---------+
> | Metric | Baseline | Patched |
> +------------------------+-----------+---------+
> | Above % (Above/Total) | 89.67% | 11.63% |
> | Below % (Below/Total) | 0.16% | 4.91% |
> | Total cpuidle miss (%) | 89.83% | 16.54% |
> +------------------------+-----------+---------+
>
> Signed-off-by: Aboorva Devarajan <aboorvad@...ux.ibm.com>
>
> ---
>
> v1: https://lore.kernel.org/all/20240809073120.250974-1-aboorvad@linux.ibm.com/
>
> v1 -> v2:
>
> - Drop cover letter and improve commit message.
> ---
> drivers/cpuidle/governors/menu.c | 11 -----------
> 1 file changed, 11 deletions(-)
>
> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
> index 28363bfa3e4c..4b199377e4be 100644
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -296,17 +296,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
> idx = i; /* first enabled state */
>
> if (s->target_residency_ns > predicted_ns) {
> - /*
> - * Use a physical idle state, not busy polling, unless
> - * a timer is going to trigger soon enough.
> - */
> - if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
> - s->exit_latency_ns <= latency_req &&
> - s->target_residency_ns <= data->next_timer_ns) {
> - predicted_ns = s->target_residency_ns;
> - idx = i;
> - break;
> - }
> if (predicted_ns < TICK_NSEC)
> break;
>
I'm still fine with this and don't see a better way to solve the reported
issue generally, see the discussion on v1.
Rafael, do you have any objections?
We could make this conditional on there being a high latency difference between
polling and non-polling to keep x86 behavior.
Powered by blists - more mailing lists