Message-ID: <ZkU8lm2tjm_r9FpZ@pavilion.home>
Date: Thu, 16 May 2024 00:52:06 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Levi Yun <ppbuk5246@...il.com>, Joel Fernandes <joel@...lfernandes.org>,
	Vineeth Pillai <vineeth@...byteword.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>
Cc: anna-maria@...utronix.de, mingo@...nel.org, tglx@...utronix.de,
	Markus.Elfring@....de, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] time/tick-sched: idle load balancing when nohz_full
 cpu becomes idle.

On Thu, May 09, 2024 at 10:29:32AM +0100, Levi Yun wrote:
> When a nohz_full CPU stops its tick in tick_nohz_irq_exit(),
> it won't be chosen to perform idle load balancing because it doesn't
> call nohz_balance_enter_idle() in tick_nohz_idle_stop_tick() when it
> becomes idle.
> 
> Formerly, __tick_nohz_idle_enter() was called in both
> tick_nohz_irq_exit() and do_idle().
> That's why commit a0db971e4eb6 ("nohz: Move idle balancer registration
> to the idle path") prevents a nohz_full CPU whose tick is stopped but
> which isn't yet idle from entering idle balance.
> 
> However, this also prevents a nohz_full CPU that has already stopped its
> tick from entering idle balance when it really becomes idle.
> 
> Currently, tick_nohz_idle_stop_tick() is only called in idle state and
> it calls nohz_balance_enter_idle(). That function properly tracks, with
> rq->nohz_tick_stopped, the CPU which is part of nohz.idle_cpus_mask.
> 
> Therefore, change tick_nohz_idle_stop_tick() to call nohz_balance_enter_idle()
> without checking !was_stopped so that a nohz_full CPU can be chosen to
> perform idle load balancing when it enters idle state.
> 
> Fixes: a0db971e4eb6 ("nohz: Move idle balancer registration to the idle path")
> Signed-off-by: Levi Yun <ppbuk5246@...il.com>
> ---
> v4:
> 	- Add fixes tags.
> 
> v3:
> 	- Rewording commit message.
> 
> v2:
> 	- Fix typos in commit message.
> 
>  kernel/time/tick-sched.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 71a792cd8936..31a4cd89782f 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1228,8 +1228,10 @@ void tick_nohz_idle_stop_tick(void)
>  		ts->idle_sleeps++;
>  		ts->idle_expires = expires;
> 
> -		if (!was_stopped && tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
> -			ts->idle_jiffies = ts->last_jiffies;
> +		if (tick_sched_flag_test(ts, TS_FLAG_STOPPED)) {
> +			if (!was_stopped)
> +				ts->idle_jiffies = ts->last_jiffies;
> +

I've taken some time to respond because your patch has raised more questions
while discussing this with Anna-Maria:

1) Is idle load balancing actually relevant for nohz_full? HK_TYPE_MISC already
   prevents those CPUs from becoming the idle load balancer. They can still be
   targets for load balancing, but nohz_full CPUs are supposed to run only one
   task.

2) This is related to the previous point: HK_TYPE_SCHED is never activated. It
   would prevent the CPU from even being part of idle load balancing. Should we
   remove it or plug it?

3) nohz_balance_enter_idle() is called when the tick is stopped for the first
   time and nohz_balance_exit_idle() is called from the tick. But that also
   applies to idle ticks. So if the load balancing triggers while the tick is
   stopped, nohz_balance_enter_idle() won't be re-called in the idle loop even
   though the tick is stopped (that would be fixed with your patch).

4) Why is nohz_balance_exit_idle() called from the tick and not from the idle
   exit path? Is it to avoid overhead?
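
For illustration only, the condition change under discussion can be modelled
in a few lines of userspace C. This is a rough sketch, not kernel code: the
struct, its fields and the model_*() helpers are hypothetical stand-ins for
TS_FLAG_STOPPED, nohz.idle_cpus_mask membership and the real tick paths, and
it assumes the tick is always successfully stopped on idle entry.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one CPU; none of this is kernel API. */
struct cpu_model {
	bool tick_stopped;	/* stands in for TS_FLAG_STOPPED */
	bool in_idle_mask;	/* stands in for being set in nohz.idle_cpus_mask */
};

/* nohz_full case: the tick is stopped on IRQ exit while a task still runs. */
static void model_irq_exit_stop_tick(struct cpu_model *c)
{
	c->tick_stopped = true;
}

/*
 * Idle entry. With patched == false the registration is gated on
 * !was_stopped, as in the removed line of the diff; with patched == true
 * it only depends on the tick being stopped.
 */
static void model_idle_stop_tick(struct cpu_model *c, bool patched)
{
	bool was_stopped = c->tick_stopped;

	c->tick_stopped = true;	/* assumption: stopping the tick succeeds */
	if (patched || !was_stopped)
		c->in_idle_mask = true;	/* models nohz_balance_enter_idle() */
}

/* Returns whether the CPU ends up registered once it has gone idle. */
static bool model_registered_after_idle(bool nohz_full, bool patched)
{
	struct cpu_model c = { false, false };

	if (nohz_full)
		model_irq_exit_stop_tick(&c);
	model_idle_stop_tick(&c, patched);
	return c.in_idle_mask;
}
```

In this model, model_registered_after_idle(true, false) is the only
combination that leaves the CPU unregistered, which is the case the patch
description is about. It says nothing about whether registering a nohz_full
CPU is desirable, which is what questions 1) and 2) ask.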

I'm adding some scheduler people in Cc who might help answer some of those
questions.

Thanks.
   

>  			nohz_balance_enter_idle(cpu);
>  		}
>  	} else {
> --
> 2.41.0
