linux-kernel - Re: [PATCH v1 3/3] cpuidle: governors: menu: Special-case nohz

Message-ID: <9104c434-9025-4365-8127-28014ddddc8d@arm.com>
Date: Thu, 14 Aug 2025 15:09:15 +0100
From: Christian Loehle <christian.loehle@....com>
To: "Rafael J. Wysocki" <rafael@...nel.org>,
 Linux PM <linux-pm@...r.kernel.org>
Cc: Frederic Weisbecker <frederic@...nel.org>,
 LKML <linux-kernel@...r.kernel.org>, Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v1 3/3] cpuidle: governors: menu: Special-case nohz_full
 CPUs

On 8/13/25 11:29, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> 
> When the menu governor runs on a nohz_full CPU and there are no user
> space timers in the workload on that CPU, it ends up selecting idle
> states with target residency values above TICK_NSEC all the time due to
> a tick_nohz_tick_stopped() check designed for a different use case.
> Namely, on nohz_full CPUs the fact that the tick has been stopped does
> not actually mean anything in particular, whereas in the other case it
> indicates that previously the CPU was expected to be idle sufficiently
> long for the tick to be stopped, so it is not unreasonable to expect
> it to be idle beyond the tick period length again.
>   
> In some cases, this behavior causes latency in the workload to grow
> undesirably.  It may also cause the workload to consume more energy
> than necessary if the CPU does not spend enough time in the selected
> deep idle states.
> 
> Address this by amending the tick_nohz_tick_stopped() check in question
> with a tick_nohz_full_cpu() one to avoid using the time till the next
> timer event as the predicted_ns value all the time on nohz_full CPUs.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> ---
>  drivers/cpuidle/governors/menu.c |   12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -293,8 +293,18 @@
>  	 * in a shallow idle state for a long time as a result of it.  In that
>  	 * case, say we might mispredict and use the known time till the closest
>  	 * timer event for the idle state selection.
> +	 *
> +	 * However, on nohz_full CPUs the tick does not run as a rule and the
> +	 * time till the closest timer event may always be effectively infinite,
> +	 * so using it as a replacement for the predicted idle duration would
> +	 * effectively always cause the prediction results to be discarded and
> +	 * deep idle states to be selected all the time.  That might introduce
> +	 * unwanted latency into the workload and cause more energy than
> +	 * necessary to be consumed if the discarded prediction results are
> +	 * actually accurate, so skip nohz_full CPUs here.
>  	 */
> -	if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
> +	if (tick_nohz_tick_stopped() && !tick_nohz_full_cpu(dev->cpu) &&
> +	    predicted_ns < TICK_NSEC)
>  		predicted_ns = data->next_timer_ns;
>  
>  	/*
> 
> 
> 

OTOH, the behaviour with $SUBJECT possibly means that we use predicted_ns from
get_typical_interval() (which may suggest picking a shallow state based on
previous wakeup patterns), only to then never wake up again?
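
To make the trade-off concrete, here is a minimal user-space sketch of the
check before and after the patch. It is not kernel code: every sketch_* name
is hypothetical, the 4 ms tick period (HZ=250) is an assumption, and
UINT64_MAX only stands in for an "effectively infinite" time till the next
timer event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed tick period of 4 ms (HZ=250); the real TICK_NSEC is config-dependent. */
#define SKETCH_TICK_NSEC 4000000ULL

/* Hypothetical stand-ins for the values the menu governor check reads. */
struct sketch_inputs {
	bool tick_stopped;      /* models tick_nohz_tick_stopped() */
	bool nohz_full_cpu;     /* models tick_nohz_full_cpu(dev->cpu) */
	uint64_t predicted_ns;  /* prediction, e.g. from get_typical_interval() */
	uint64_t next_timer_ns; /* models data->next_timer_ns */
};

/* The check as it reads before the patch. */
static uint64_t sketch_predict_old(const struct sketch_inputs *in)
{
	if (in->tick_stopped && in->predicted_ns < SKETCH_TICK_NSEC)
		return in->next_timer_ns;
	return in->predicted_ns;
}

/* The check as amended by the patch: nohz_full CPUs keep the prediction. */
static uint64_t sketch_predict_new(const struct sketch_inputs *in)
{
	if (in->tick_stopped && !in->nohz_full_cpu &&
	    in->predicted_ns < SKETCH_TICK_NSEC)
		return in->next_timer_ns;
	return in->predicted_ns;
}

/* nohz_full CPU, tick stopped, short prediction, no timer pending. */
void sketch_demo(void)
{
	struct sketch_inputs in = {
		.tick_stopped = true,
		.nohz_full_cpu = true,
		.predicted_ns = 50000ULL, /* 50 us, suggests a shallow state */
		.next_timer_ns = UINT64_MAX,
	};

	/* Old check discards the prediction, so deep states get selected. */
	assert(sketch_predict_old(&in) == UINT64_MAX);
	/* New check keeps the 50 us prediction even if no wakeup ever comes. */
	assert(sketch_predict_new(&in) == 50000ULL);
}
```

In this model the second assertion is the case I am asking about: the short
prediction survives on a nohz_full CPU, and nothing in the sketch bounds how
long the CPU then sits in the shallow state.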