lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 26 Jun 2024 13:41:13 +0100
From: Christian Loehle <christian.loehle@....com>
To: Dietmar Eggemann <dietmar.eggemann@....com>, linux-pm@...r.kernel.org,
 linux-kernel@...r.kernel.org, rafael@...nel.org
Cc: vincent.guittot@...aro.org, qyousef@...alina.io, peterz@...radead.org,
 daniel.lezcano@...aro.org, ulf.hansson@...aro.org, anna-maria@...utronix.de,
 kajetan.puchalski@....com, lukasz.luba@....com
Subject: Re: [PATCHv2 2/3] cpuidle: teo: Remove recent intercepts metric

On 6/26/24 11:44, Dietmar Eggemann wrote:
> On 11/06/2024 13:24, Christian Loehle wrote:
>> The logic for recent intercepts didn't work, there is an underflow
>> of the 'recent' value that can be observed during boot already, which
>> teo usually doesn't recover from, making the entire logic pointless.
>> Furthermore the recent intercepts also were never reset, thus not
>> actually being very 'recent'.
>>
>> Having underflowed 'recent' values lead to teo always acting as if
>> we were in a scenario were expected sleep length based on timers is
>> too high and it therefore unnecessarily selecting shallower states.
>>
>> Experiments show that the remaining 'intercept' logic is enough to
>> quickly react to scenarios in which teo cannot rely on the timer
>> expected sleep length.
>>
>> See also here:
>> https://lore.kernel.org/lkml/0ce2d536-1125-4df8-9a5b-0d5e389cd8af@arm.com/
> 
> So the fixes proposed in teo_update() there:
> 
> (1) add '&& cpu_data->state_bins[cpu_data->recent_idx[i]].recent' to the 
>     if condition to decrement 'cpu_data->state_bins[].recent' or

That fixes the underflow for sure, but doesn't make much sense either.
You still have a value called 'recent' that isn't actually reset regularly.
So an intercept-heavy workload can still affect the system
minutes/hours/days later. There is no reset nor EMA, so if anything, the
regular intercept counters would be the 'recent' one.
See also below.

> 
> (2) Set 'cpu_data->state_bins[].recent' = 0 instead of decrement

That also 'works' in the sense that it doesn't contain blatantly obvious
values then. It is basically a 1bit information about recent intercepts,
I didn't see much practical use for that, given that the EMA of intercepts
(See patch 3), is quite aggressive (i.e. 'recent').
If an additional intercept that is more 'recent' (and one that is less
'recent') was ever needed I don't know and would defer to Rafael here.
I don't see the need for one after this series is applied, though.

The two suggestions above are 'fixes', this v2 2/3 is now more of a
revert you could say. (I didn't tag it as such explicitly because the
commit did more than just add the 'recent' tracking, so it's not a revert.)

> 
> didn't work?
> 
>> Fixes: 77577558f25d ("cpuidle: teo: Rework most recent idle duration values treatment")
>> Signed-off-by: Christian Loehle <christian.loehle@....com>
>> ---
>>  drivers/cpuidle/governors/teo.c | 73 ++++++---------------------------
>>  1 file changed, 12 insertions(+), 61 deletions(-)
>>
>> diff --git a/drivers/cpuidle/governors/teo.c b/drivers/cpuidle/governors/teo.c
>> index d8554c20cf10..cc7df59f488d 100644
>> --- a/drivers/cpuidle/governors/teo.c
>> +++ b/drivers/cpuidle/governors/teo.c
>> @@ -59,10 +59,6 @@
>>   * shallower than the one whose bin is fallen into by the sleep length (these
>>   * situations are referred to as "intercepts" below).
>>   *
>> - * In addition to the metrics described above, the governor counts recent
>> - * intercepts (that is, intercepts that have occurred during the last
>> - * %NR_RECENT invocations of it for the given CPU) for each bin.
>> - *
>>   * In order to select an idle state for a CPU, the governor takes the following
>>   * steps (modulo the possible latency constraint that must be taken into account
>>   * too):
>> @@ -81,20 +77,15 @@
> 
>  66  * 1. Find the deepest CPU idle state whose target residency does not exceed
>  67  *    the current sleep length (the candidate idle state) and compute 3 sums as
> 
> s/3/2 ?

Thanks, good catch!

> 
> [...]
> 
>> @@ -320,8 +288,6 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
>>  	unsigned int tick_intercept_sum = 0;
>>  	unsigned int idx_intercept_sum = 0;
>>  	unsigned int intercept_sum = 0;
>> -	unsigned int idx_recent_sum = 0;
>> -	unsigned int recent_sum = 0;
>>  	unsigned int idx_hit_sum = 0;
>>  	unsigned int hit_sum = 0;
>>  	int constraint_idx = 0;
> 
> Looks like 'bool alt_intercepts, alt_recent' have to go here already and
> not in 3/3.

Correct, will pick that up.
Thank you for reviewing!

> 
> [...]


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ