linux-kernel - Re: [PATCH v2 1/2] cpuidle: governors: teo: Adjust the classification of wakeup events


Message-ID: <44c4ee5a-34ec-4b23-b06b-05bd0fda6585@arm.com>
Date: Thu, 29 Jan 2026 09:16:00 +0000
From: Christian Loehle <christian.loehle@....com>
To: "Rafael J. Wysocki" <rafael@...nel.org>,
 Linux PM <linux-pm@...r.kernel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Doug Smythies <dsmythies@...us.net>
Subject: Re: [PATCH v2 1/2] cpuidle: governors: teo: Adjust the classification
 of wakeup events

On 1/26/26 19:45, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> 
> If differences between target residency values of adjacent idle states
> of a given CPU are relatively large, the corresponding idle state bins
> used by the teo governor are large as well and the rule by which hits
> are distinguished from intercepts is inaccurate.
> 
> Namely, by that rule, a wakeup event is classified as a hit if the
> sleep length (the time till the closest timer other than the tick)
> and the measured idle duration, adjusted for the entered idle state
> exit latency, fall into the same idle state bin.  However, if that bin
> is large enough, the actual difference between the sleep length and
> the measured idle duration may be significant.  It may in fact be
> significantly greater than the analogous difference for an event where
> the sleep length and the measured idle duration fall into different
> bins.
> 
> For this reason, amend the rule in question with a check that will
> only allow a wakeup event to be counted as a hit if the difference
> between the sleep length and the measured idle duration is less than
> LATENCY_THRESHOLD_NS (which means that the difference between the
> sleep length and the raw measured idle duration is below the sum of
> LATENCY_THRESHOLD_NS and 1/2 of the entered idle state exit latency).
> Otherwise, the event will be counted as an intercept.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> ---
> 
> v1.1 -> v2: No changes
> 
> v1 -> v1.1
>    * Drop the change in teo_select() along with the corresponding
>      part of the changelog (after receiving testing feedback from
>      Christian)
> 
> This is a resend of
> 
> https://lore.kernel.org/linux-pm/4707705.LvFx2qVVIh@rafael.j.wysocki/
> 
> It applies on top of the first three patches from
> 
> https://lore.kernel.org/linux-pm/2257365.irdbgypaU6@rafael.j.wysocki/
> 
> ---
>  drivers/cpuidle/governors/teo.c |   32 ++++++++++++++++----------------
>  1 file changed, 16 insertions(+), 16 deletions(-)
> 
> --- a/drivers/cpuidle/governors/teo.c
> +++ b/drivers/cpuidle/governors/teo.c
> @@ -48,13 +48,11 @@
>   * in accordance with what happened last time.
>   *
>   * The "hits" metric reflects the relative frequency of situations in which the
> - * sleep length and the idle duration measured after CPU wakeup fall into the
> - * same bin (that is, the CPU appears to wake up "on time" relative to the sleep
> - * length).  In turn, the "intercepts" metric reflects the relative frequency of
> - * non-timer wakeup events for which the measured idle duration falls into a bin
> - * that corresponds to an idle state shallower than the one whose bin is fallen
> - * into by the sleep length (these events are also referred to as "intercepts"
> - * below).
> + * sleep length and the idle duration measured after CPU wakeup are close enough
> + * (that is, the CPU appears to wake up "on time" relative to the sleep length).
> + * In turn, the "intercepts" metric reflects the relative frequency of non-timer
> + * wakeup events for which the measured idle duration is measurably less than
> + * the sleep length (these events are also referred to as "intercepts" below).
>   *
>   * The governor also counts "intercepts" with the measured idle duration below
>   * the tick period length and uses this information when deciding whether or not
> @@ -253,12 +251,16 @@ static void teo_update(struct cpuidle_dr
>  	}
>  
>  	/*
> -	 * If the measured idle duration falls into the same bin as the sleep
> -	 * length, this is a "hit", so update the "hits" metric for that bin.
> +	 * If the measured idle duration falls into the same bin as the
> +	 * sleep length and the difference between them is less than
> +	 * LATENCY_THRESHOLD_NS, this is a "hit", so update the "hits"
> +	 * metric for that bin.
> +	 *
>  	 * Otherwise, update the "intercepts" metric for the bin fallen into by
>  	 * the measured idle duration.
>  	 */
> -	if (idx_timer == idx_duration) {
> +	if (idx_timer == idx_duration &&
> +	    cpu_data->sleep_length_ns - measured_ns < LATENCY_THRESHOLD_NS) {

So the difference needs to be below 7.5 us here.
Can we always expect that to hold?
Especially since measured_ns has already gone through the "infer the average
from the worst-case exit latency" handling.
On deeper states, the adjustment
measured_ns -= lat_ns / 2;
is an order of magnitude larger than our threshold.

So the threshold should probably be something like
exit_latency / 2 + LATENCY_THRESHOLD_NS?
Or just exit_latency, allowing the error in both directions?
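
To make the concern concrete, here is a rough userspace sketch (not kernel
code) that compares the check as posted with the two alternatives above.
The enum and helper names are hypothetical, the 7.5 us value is the
threshold quoted above, and the sketch assumes that both values already
fall into the same bin, that the raw duration is at least the exit latency,
and that the sleep length is not shorter than the adjusted duration. It
uses signed arithmetic throughout and does not try to reproduce the exact
types used in teo.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* 7.5 us, the threshold value discussed in this thread. */
#define LATENCY_THRESHOLD_NS 7500LL

/* Hypothetical names for the three candidate tolerances. */
enum hit_tolerance {
	TOL_AS_POSTED,		/* LATENCY_THRESHOLD_NS */
	TOL_HALF_LAT_PLUS,	/* exit_latency / 2 + LATENCY_THRESHOLD_NS */
	TOL_FULL_LAT,		/* exit_latency */
};

/* Mirrors "measured_ns -= lat_ns / 2"; assumes raw_ns >= lat_ns. */
static int64_t adjusted_idle_ns(int64_t raw_ns, int64_t lat_ns)
{
	return raw_ns - lat_ns / 2;
}

/* Tolerance applied to (sleep length - adjusted idle duration). */
static int64_t tolerance_ns(enum hit_tolerance tol, int64_t lat_ns)
{
	switch (tol) {
	case TOL_HALF_LAT_PLUS:
		return lat_ns / 2 + LATENCY_THRESHOLD_NS;
	case TOL_FULL_LAT:
		return lat_ns;
	default:
		return LATENCY_THRESHOLD_NS;
	}
}

/*
 * Assumes idx_timer == idx_duration already holds and that
 * sleep_length_ns >= adjusted idle duration (one-sided check only).
 */
static bool is_hit(int64_t sleep_length_ns, int64_t raw_ns, int64_t lat_ns,
		   enum hit_tolerance tol)
{
	int64_t diff = sleep_length_ns - adjusted_idle_ns(raw_ns, lat_ns);

	return diff < tolerance_ns(tol, lat_ns);
}
```

For an assumed deep state with a 200 us exit latency, a 1000 us sleep length
and a raw measured duration of 1000 us, the adjusted duration is 900 us, so
the difference is 100 us: an intercept with the check as posted, but a hit
with either alternative tolerance.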

>  		cpu_data->state_bins[idx_timer].hits += PULSE;
>  	} else {
>  		cpu_data->state_bins[idx_duration].intercepts += PULSE;
> 
> 
>