linux-kernel - Re: [RFT][PATCH v1] cpuidle: teo: Avoid selecting deepest idle state over-eagerly

Message-ID: <09472579-d59a-4be9-996b-1638228895ac@arm.com>
Date: Tue, 18 Feb 2025 11:28:48 +0000
From: Christian Loehle <christian.loehle@....com>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>,
 Linux PM <linux-pm@...r.kernel.org>, dsmythies@...us.net,
 LKML <linux-kernel@...r.kernel.org>,
 Daniel Lezcano <daniel.lezcano@...aro.org>,
 Artem Bityutskiy <artem.bityutskiy@...ux.intel.com>,
 Aboorva Devarajan <aboorvad@...ux.ibm.com>
Subject: Re: [RFT][PATCH v1] cpuidle: teo: Avoid selecting deepest idle state
 over-eagerly

On 2/14/25 21:34, Rafael J. Wysocki wrote:
> On Thu, Feb 13, 2025 at 3:08 PM Christian Loehle
> <christian.loehle@....com> wrote:
>>
>> On 2/4/25 20:58, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>
>>> It has been observed that the recent teo governor update which concluded
>>> with commit 16c8d7586c19 ("cpuidle: teo: Skip sleep length computation
>>> for low latency constraints") caused the max-jOPS score of the SPECjbb
>>> 2015 benchmark [1] on Intel Granite Rapids to decrease by around 1.4%.
>>> While it may be argued that this is not a significant decrease, the
>>> previous score can be restored by tweaking the inequality used by teo
>>> to decide whether or not to preselect the deepest enabled idle state.
>>> That change also causes the critical-jOPS score of SPECjbb to increase
>>> by around 2%.
>>>
>>> Namely, the likelihood of selecting the deepest enabled idle state in
>>> teo on the platform in question has increased after commit 13ed5c4a6d9c
>>> ("cpuidle: teo: Skip getting the sleep length if wakeups are very
>>> frequent") because some timer wakeups were previously counted as non-
>>> timer ones and they were effectively added to the left-hand side of the
>>> inequality deciding whether or not to preselect the deepest idle state.
>>>
>>> Many of them are now (accurately) counted as timer wakeups, so the left-
>>> hand side of that inequality is now effectively smaller in some cases,
>>> especially when timer wakeups often occur in the range below the target
>>> residency of the deepest enabled idle state and idle states with target
>>> residencies below RESIDENCY_THRESHOLD_NS are often selected, but the
>>> majority of recent idle intervals are still above that value most of
>>> the time.  As a result, the deepest enabled idle state may be selected
>>> more often than it used to be selected in some cases.
>>>
>>> To counter that effect, add the sum of the hits metric for all of the
>>> idle states below the candidate one (which is the deepest enabled idle
>>> state at that point) to the left-hand side of the inequality mentioned
>>> above.  This will cause it to be more balanced because, in principle,
>>> putting both timer and non-timer wakeups on both sides of it is more
>>> consistent than only taking into account the timer wakeups in the range
>>> above the target residency of the deepest enabled idle state.
>>>
>>> Link: https://www.spec.org/jbb2015/ [1]
>>> Tested-by: Artem Bityutskiy <artem.bityutskiy@...ux.intel.com>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>> ---
>>>  drivers/cpuidle/governors/teo.c |    6 +++---
>>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> --- a/drivers/cpuidle/governors/teo.c
>>> +++ b/drivers/cpuidle/governors/teo.c
>>> @@ -349,13 +349,13 @@
>>>       }
>>>
>>>       /*
>>> -      * If the sum of the intercepts metric for all of the idle states
>>> -      * shallower than the current candidate one (idx) is greater than the
>>> +      * If the sum of the intercepts and hits metric for all of the idle
>>> +      * states below the current candidate one (idx) is greater than the
>>>        * sum of the intercepts and hits metrics for the candidate state and
>>>        * all of the deeper states, a shallower idle state is likely to be a
>>>        * better choice.
>>>        */
>>> -     if (2 * idx_intercept_sum > cpu_data->total - idx_hit_sum) {
>>> +     if (2 * (idx_intercept_sum + idx_hit_sum) > cpu_data->total) {
>>>               int first_suitable_idx = idx;
>>>
>>>               /*
>>>
>>>
>>>
>>
>> I'm curious, are Doug's numbers reproducible?
>> Or could you share the idle state usage numbers? Is that explainable?
>> Seems like a lot and it does worry me that I can't reproduce anything
>> as drastic.
> 
> Well, it may not be drastic, but the results below pretty much confirm
> that this particular change isn't going in the right direction IMV.

Agreed, I'd still be eager to pick up something like what Doug reported
with my tests too :(
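
To make the effect of the one-line change concrete, the two forms of the
check can be compared side by side. This is only a sketch: struct teo_sums
and both function names are hypothetical, the fields merely mirror the
variables in the diff, and it assumes that total is the grand total of the
intercepts and hits metrics over all bins (so idx_hit_sum never exceeds it).

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the sums used in the patched condition.
 * This is not the kernel's per-CPU structure; the field names only
 * mirror the variables that appear in the diff.
 */
struct teo_sums {
	unsigned int idx_intercept_sum; /* intercepts, states shallower than idx */
	unsigned int idx_hit_sum;       /* hits, states shallower than idx */
	unsigned int total;             /* assumed: all intercepts + hits */
};

/* Condition before the patch: 2 * idx_intercept_sum > total - idx_hit_sum */
static bool prefer_shallower_old(const struct teo_sums *s)
{
	return 2 * s->idx_intercept_sum > s->total - s->idx_hit_sum;
}

/* Condition after the patch: 2 * (idx_intercept_sum + idx_hit_sum) > total */
static bool prefer_shallower_new(const struct teo_sums *s)
{
	return 2 * (s->idx_intercept_sum + s->idx_hit_sum) > s->total;
}
```

With made-up values idx_intercept_sum = 30, idx_hit_sum = 30 and
total = 100, the old form compares 60 > 70 (false) and the new form
compares 120 > 100 (true): the hits for the shallower states now carry
the same weight as the intercepts on the left-hand side.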

> 
>> I did a bit of x86 as well and got for Raptor Lake (I won't post the
>> non-x86 numbers now, but teo-tweak performs comparable to teo mainline.)
>>
>> Idle 5 min:
>> device   gov     iter    Joules  idles   idle_misses     idle_miss_ratio         belows  aboves
>> teo     0       170.02  12690   646     0.051   371     275
>> teo     1       123.17  8361    517     0.062   281     236
>> teo     2       122.76  7741    347     0.045   262     85
>> teo     3       118.5   8699    668     0.077   307     361
>> teo     4       80.46   8113    443     0.055   264     179
>> teo-tweak       0       115.05  10223   803     0.079   323     480
>> teo-tweak       1       164.41  8523    631     0.074   263     368
>> teo-tweak       2       163.91  8409    711     0.085   256     455
>> teo-tweak       3       137.22  8581    721     0.084   261     460
>> teo-tweak       4       174.95  8703    675     0.078   261     414
> 
> So basically the energy usage goes up, idle misses go up, idle misses
> ratio goes up and the "above" misses go way up.  Not so good as far as
> I'm concerned.
> 
>> teo     0       164.34  8443    516     0.061   303     213
>> teo     1       167.85  8767    492     0.056   256     236
>> teo     2       166.25  7835    406     0.052   263     143
>> teo     3       189.77  8865    493     0.056   276     217
>> teo     4       136.97  9185    467     0.051   286     181
> 
> The above is menu I think?

No, this is teo again. I just wanted to include it because the variance is
quite large (not unusual for idle).
The full table, including menu (mainline):

gov        iter  Joules  idles  idle_misses  idle_miss_ratio  belows  aboves
teo        0     170.02  12690  646          0.051            371     275
teo        1     123.17  8361   517          0.062            281     236
teo        2     122.76  7741   347          0.045            262     85
teo        3     118.5   8699   668          0.077            307     361
teo        4     80.46   8113   443          0.055            264     179
teo-tweak  0     115.05  10223  803          0.079            323     480
teo-tweak  1     164.41  8523   631          0.074            263     368
teo-tweak  2     163.91  8409   711          0.085            256     455
teo-tweak  3     137.22  8581   721          0.084            261     460
teo-tweak  4     174.95  8703   675          0.078            261     414
teo        0     164.34  8443   516          0.061            303     213
teo        1     167.85  8767   492          0.056            256     236
teo        2     166.25  7835   406          0.052            263     143
teo        3     189.77  8865   493          0.056            276     217
teo        4     136.97  9185   467          0.051            286     181
menu       0     180.13  8925   343          0.038            303     40
menu       1     208.49  8717   345          0.040            312     33
menu       2     168.38  8451   321          0.038            274     47
menu       3     139.48  7853   310          0.039            289     21
menu       4     166.61  7769   339          0.044            322     17
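
Given how much the individual iterations vary, per-set averages may be
easier to compare than single rows. A minimal sketch of that, using only
the Joules and aboves columns from the table above; the array names and
the mean() helper are made up for illustration and are not part of any
test tooling.

```c
#include <stddef.h>

#define N_ITER 5 /* iterations per set in the table above */

/* Arithmetic mean of n samples; returns 0.0 for an empty set. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	if (n == 0)
		return 0.0;
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Joules column, copied from the table (idle, 5 min). */
static const double teo_run1_joules[N_ITER]  = { 170.02, 123.17, 122.76, 118.5, 80.46 };
static const double teo_tweak_joules[N_ITER] = { 115.05, 164.41, 163.91, 137.22, 174.95 };
static const double teo_run2_joules[N_ITER]  = { 164.34, 167.85, 166.25, 189.77, 136.97 };
static const double menu_joules[N_ITER]      = { 180.13, 208.49, 168.38, 139.48, 166.61 };

/* aboves column, copied from the same rows. */
static const double teo_run1_aboves[N_ITER]  = { 275, 236, 85, 361, 179 };
static const double teo_tweak_aboves[N_ITER] = { 480, 368, 455, 460, 414 };
static const double teo_run2_aboves[N_ITER]  = { 213, 236, 143, 217, 181 };
static const double menu_aboves[N_ITER]      = { 40, 33, 47, 21, 17 };
```

Calling mean(teo_tweak_aboves, N_ITER) and comparing it with the result
for either teo set shows the higher 'above' count for teo-tweak, while
the Joules means of the two teo sets differ noticeably from each other,
which is the variance mentioned above.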

> 
>> At least in the idle case you can see an increase in 'above' idle_misses.

Agreed, idle_misses clearly go up as mentioned.

>>
>> Firefox Youtube 4K video playback 2 min:
>> device   gov     iter    Joules  idles   idle_misses     idle_miss_ratio         belows  aboves
>> teo     0       260.09  67404   11189   0.166   1899    9290
>> teo     1       273.71  76649   12155   0.159   2233    9922
>> teo     2       231.45  59559   10344   0.174   1747    8597
>> teo     3       202.61  58223   10641   0.183   1748    8893
>> teo     4       217.56  61411   10744   0.175   1809    8935
>> teo-tweak       0       227.99  61209   11251   0.184   2110    9141
>> teo-tweak       1       222.44  61959   10323   0.167   1474    8849
>> teo-tweak       2       218.1   64380   11080   0.172   1845    9235
>> teo-tweak       3       207.4   60183   11267   0.187   1929    9338
>> teo-tweak       4       217.46  61253   10381   0.169   1620    8761
> 
> And it doesn't improve things drastically here, although on average it
> does reduce energy usage.