lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240609224701.pc6om2o5ep6btywe@airbuntu>
Date: Sun, 9 Jun 2024 23:47:01 +0100
From: Qais Yousef <qyousef@...alina.io>
To: Christian Loehle <christian.loehle@....com>
Cc: linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
	rafael@...nel.org, vincent.guittot@...aro.org, peterz@...radead.org,
	daniel.lezcano@...aro.org, anna-maria@...utronix.de,
	kajetan.puchalski@....com, lukasz.luba@....com,
	dietmar.eggemann@....com
Subject: Re: [PATCH 1/6] cpuidle: teo: Increase util-threshold

On 06/06/24 10:00, Christian Loehle wrote:
> Increase the util-threshold by a lot as it was low enough for some
> minor load to always be active, especially on smaller CPUs.
> 
> For small cap CPUs (Pixel6) the util threshold is as low as 1.
> For CPUs of capacity <64 it is 0. So ensure it is at a minimum, too.
> 
> Fixes: 9ce0f7c4bc64 ("cpuidle: teo: Introduce util-awareness")
> Reported-by: Qais Yousef <qyousef@...alina.io>
> Reported-by: Vincent Guittot <vincent.guittot@...aro.org>
> Suggested-by: Kajetan Puchalski <kajetan.puchalski@....com>
> Signed-off-by: Christian Loehle <christian.loehle@....com>
> ---
>  drivers/cpuidle/governors/teo.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c
> index 7244f71c59c5..45f43e2ee02d 100644
> --- a/drivers/cpuidle/governors/teo.c
> +++ b/drivers/cpuidle/governors/teo.c
> @@ -146,13 +146,11 @@
>   * The number of bits to shift the CPU's capacity by in order to determine
>   * the utilized threshold.
>   *
> - * 6 was chosen based on testing as the number that achieved the best balance
> - * of power and performance on average.
> - *
>   * The resulting threshold is high enough to not be triggered by background
> - * noise and low enough to react quickly when activity starts to ramp up.
> + * noise.
>   */
> -#define UTIL_THRESHOLD_SHIFT 6
> +#define UTIL_THRESHOLD_SHIFT 2
> +#define UTIL_THRESHOLD_MIN 50
>  
>  /*
>   * The PULSE value is added to metrics when they grow and the DECAY_SHIFT value
> @@ -671,7 +669,8 @@ static int teo_enable_device(struct cpuidle_driver *drv,
>  	int i;
>  
>  	memset(cpu_data, 0, sizeof(*cpu_data));
> -	cpu_data->util_threshold = max_capacity >> UTIL_THRESHOLD_SHIFT;
> +	cpu_data->util_threshold = max(UTIL_THRESHOLD_MIN,
> +				max_capacity >> UTIL_THRESHOLD_SHIFT);

Thanks for trying to fix this. But I am afraid this is not a solution. There's
no magic number that can truly work here - we tried. As I tried to explain
before, a higher util value doesn't mean long idle time is unlikely. And
blocked load can cause problems where a decay can take too long.

We are following up with the suggestions I have thrown back then and we'll
share results if anything actually works.

For now, I think a revert is more appropriate. There was some perf benefit, but
the power regressions were bad and there's no threshold value that actually
works. The thresholding concept itself is incorrect and flawed - it seemed the
correct thing back then, yes. But in a hindsight now it doesn't work.


Thanks!

--
Qais Yousef

>  
>  	for (i = 0; i < NR_RECENT; i++)
>  		cpu_data->recent_idx[i] = -1;
> -- 
> 2.34.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ