Message-ID: <Z740eIZcK31DQETq@gmail.com>
Date: Tue, 25 Feb 2025 22:22:00 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Yujun Dong <yujundong@...cal-lab.net>
Cc: Valentin Schneider <vschneid@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cpuidle, sched: Use smp_mb__after_atomic() in
 current_clr_polling()


[ Sorry about the belated reply, found this in my TODO pile ... ]

* Yujun Dong <yujundong@...cal-lab.net> wrote:

> In architectures that use the polling bit, current_clr_polling() employs
> smp_mb() to ensure that the clearing of the polling bit is visible to
> other cores before checking TIF_NEED_RESCHED.
> 
> However, smp_mb() can be costly. Given that clear_bit() is an atomic
> operation, replacing smp_mb() with smp_mb__after_atomic() is appropriate.
> 
> Many architectures implement smp_mb__after_atomic() as a lighter-weight
> barrier compared to smp_mb(), leading to performance improvements.
> For instance, on x86, smp_mb__after_atomic() is a no-op. This change
> eliminates an smp_mb() instruction in the cpuidle wake-up path, saving
> several CPU cycles and thereby reducing wake-up latency.
> 
> Architectures that do not use the polling bit will retain the original
> smp_mb() behavior to ensure that existing dependencies remain unaffected.
> 
> Signed-off-by: Yujun Dong <yujundong@...cal-lab.net>
> ---
>  include/linux/sched/idle.h | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/sched/idle.h b/include/linux/sched/idle.h
> index e670ac282333..439f6029d3b9 100644
> --- a/include/linux/sched/idle.h
> +++ b/include/linux/sched/idle.h
> @@ -79,6 +79,21 @@ static __always_inline bool __must_check current_clr_polling_and_test(void)
>  	return unlikely(tif_need_resched());
>  }
>  
> +static __always_inline void current_clr_polling(void)
> +{
> +	__current_clr_polling();
> +
> +	/*
> +	 * Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
> +	 * Once the bit is cleared, we'll get IPIs with every new
> +	 * TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
> +	 * fold.
> +	 */
> +	smp_mb__after_atomic(); /* paired with resched_curr() */
> +
> +	preempt_fold_need_resched();
> +}
> +
>  #else
>  static inline void __current_set_polling(void) { }
>  static inline void __current_clr_polling(void) { }
> @@ -91,21 +106,15 @@ static inline bool __must_check current_clr_polling_and_test(void)
>  {
>  	return unlikely(tif_need_resched());
>  }
> -#endif
>  
>  static __always_inline void current_clr_polling(void)
>  {
>  	__current_clr_polling();
>  
> -	/*
> -	 * Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
> -	 * Once the bit is cleared, we'll get IPIs with every new
> -	 * TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
> -	 * fold.
> -	 */
>  	smp_mb(); /* paired with resched_curr() */

So this part is weird: you remove the comment that justifies the 
smp_mb(), but you leave the smp_mb() in place. Why?

Thanks,

	Ingo
