lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <525a2352-dc57-45ca-adb2-f7039c37145e@arm.com>
Date: Mon, 17 Mar 2025 08:35:34 +0000
From: Christian Loehle <christian.loehle@....com>
To: Aboorva Devarajan <aboorvad@...ux.ibm.com>,
 "Rafael J . Wysocki" <rafael@...nel.org>,
 Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Gautam Menghani <gautam@...ux.ibm.com>, linux-pm@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] cpuidle: menu: Prefer polling state for short idle
 durations

On 3/17/25 06:03, Aboorva Devarajan wrote:
> Avoid selecting deep idle state when the predicted idle duration is
> shorter than its target residency, as this leads to unnecessary state
> transitions without energy savings.
> 
> On virtualized PowerPC (pseries) systems, where only one polling state
> (Snooze) and one deep state (CEDE) are available, selecting CEDE when
> its target residency exceeds the predicted idle duration hurts
> performance.
> 
> For example, if the predicted idle duration is 15 us and the first
> non-polling state has a target residency of 120 us, selecting it
> would be suboptimal.
> 
> Remove the condition introduced in commit 69d25870f20c
> ("cpuidle: fix the menu governor to boost IO performance") that
> prioritized non-polling states even when their target residency
> exceeded the predicted idle duration and allow polling states to
> be selected when appropriate.
> 
> Performance improvement observed with pgbench on PowerPC (pseries)
> system:
> +---------------------------+------------+------------+------------+
> | Metric                    | Baseline   | Patched    | Change (%) |
> +---------------------------+------------+------------+------------+
> | Transactions/sec (TPS)    | 494,834    | 538,707    | +8.85%     |
> | Avg latency (ms)          | 0.162      | 0.149      | -8.02%     |
> +---------------------------+------------+------------+------------+
> 
> CPUIdle state usage:
> +--------------+--------------+-------------+
> | Metric       | Baseline     | Patched     |
> +--------------+--------------+-------------+
> | Total usage  | 12,703,630   | 13,941,966  |
> | Above usage  | 11,388,990   | 1,620,474   |
> | Below usage  | 19,973       | 684,708     |
> +--------------+--------------+-------------+
> 
> Above/Total and Below/Total usage percentages:
> +------------------------+-----------+---------+
> | Metric                 | Baseline  | Patched |
> +------------------------+-----------+---------+
> | Above % (Above/Total)  | 89.67%    | 11.63%  |
> | Below % (Below/Total)  | 0.16%     | 4.91%   |
> | Total cpuidle miss (%) | 89.83%    | 16.54%  |
> +------------------------+-----------+---------+
> 
> Signed-off-by: Aboorva Devarajan <aboorvad@...ux.ibm.com>
> 
> ---
> 
> v1: https://lore.kernel.org/all/20240809073120.250974-1-aboorvad@linux.ibm.com/
> 
> v1 -> v2:
> 
> - Drop cover letter and improve commit message.
> ---
>  drivers/cpuidle/governors/menu.c | 11 -----------
>  1 file changed, 11 deletions(-)
> 
> diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
> index 28363bfa3e4c..4b199377e4be 100644
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -296,17 +296,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>  			idx = i; /* first enabled state */
>  
>  		if (s->target_residency_ns > predicted_ns) {
> -			/*
> -			 * Use a physical idle state, not busy polling, unless
> -			 * a timer is going to trigger soon enough.
> -			 */
> -			if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
> -			    s->exit_latency_ns <= latency_req &&
> -			    s->target_residency_ns <= data->next_timer_ns) {
> -				predicted_ns = s->target_residency_ns;
> -				idx = i;
> -				break;
> -			}
>  			if (predicted_ns < TICK_NSEC)
>  				break;
>  

I'm still fine with this and don't see a better way to solve the reported
issue generally, see the discussion on v1.
Rafael, do you have any objections?
We could make this conditional on there being a high latency difference between
polling and non-polling to keep x86 behavior.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ