Message-ID: <20250306164217.3028977-1-yujundong@pascal-lab.net>
Date: Fri,  7 Mar 2025 09:53:44 +0800
From: Yujun Dong <yujundong@...cal-lab.net>
To: mingo@...nel.org
Cc: linux-kernel@...r.kernel.org,
	peterz@...radead.org,
	vincent.guittot@...aro.org,
	vschneid@...hat.com,
	yujundong@...cal-lab.net
Subject: Re: [PATCH] cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling()

Hi Ingo, 

* Ingo Molnar <mingo@...nel.org> wrote:
>
> [ Sorry about the belated reply, found this in my TODO pile ... ]
>
> * Yujun Dong <yujundong@...cal-lab.net> wrote:
>
>> In architectures that use the polling bit, current_clr_polling() employs
>> smp_mb() to ensure that the clearing of the polling bit is visible to
>> other cores before checking TIF_NEED_RESCHED.
>>
>> However, smp_mb() can be costly. Given that clear_bit() is an atomic
>> operation, replacing smp_mb() with smp_mb__after_atomic() is appropriate.
>>
>> Many architectures implement smp_mb__after_atomic() as a lighter-weight
>> barrier compared to smp_mb(), leading to performance improvements.
>> For instance, on x86, smp_mb__after_atomic() is a no-op. This change
>> eliminates a smp_mb() instruction in the cpuidle wake-up path, saving
>> several CPU cycles and thereby reducing wake-up latency.
>>
>> Architectures that do not use the polling bit will retain the original
>> smp_mb() behavior to ensure that existing dependencies remain unaffected.
>>
>> Signed-off-by: Yujun Dong <yujundong@...cal-lab.net>
>> ---
>>  include/linux/sched/idle.h | 23 ++++++++++++++++-------
>>  1 file changed, 16 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/sched/idle.h b/include/linux/sched/idle.h
>> index e670ac282333..439f6029d3b9 100644
>> --- a/include/linux/sched/idle.h
>> +++ b/include/linux/sched/idle.h
>> @@ -79,6 +79,21 @@ static __always_inline bool __must_check current_clr_polling_and_test(void)
>>  	return unlikely(tif_need_resched());
>>  }
>>  
>> +static __always_inline void current_clr_polling(void)
>> +{
>> +	__current_clr_polling();
>> +
>> +	/*
>> +	 * Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
>> +	 * Once the bit is cleared, we'll get IPIs with every new
>> +	 * TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
>> +	 * fold.
>> +	 */
>> +	smp_mb__after_atomic(); /* paired with resched_curr() */
>> +
>> +	preempt_fold_need_resched();
>> +}
>> +
>>  #else
>>  static inline void __current_set_polling(void) { }
>>  static inline void __current_clr_polling(void) { }
>> @@ -91,21 +106,15 @@ static inline bool __must_check current_clr_polling_and_test(void)
>>  {
>>  	return unlikely(tif_need_resched());
>>  }
>> -#endif
>>  
>>  static __always_inline void current_clr_polling(void)
>>  {
>>  	__current_clr_polling();
>>  
>> -	/*
>> -	 * Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
>> -	 * Once the bit is cleared, we'll get IPIs with every new
>> -	 * TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
>> -	 * fold.
>> -	 */
>>  	smp_mb(); /* paired with resched_curr() */
>
> So this part is weird: you remove the comment that justifies the 
> smp_mb(), but you leave the smp_mb() in place. Why?
>
> Thanks,
>
> 	Ingo

Thanks for pointing that out. The comment removal in the non-polling
branch was intentional, but my original explanation was unclear. Let
me rephrase:

Polling architectures (those that define TIF_POLLING_NRFLAG):
1. __current_clr_polling() performs an atomic clear of the polling
   bit, so smp_mb__after_atomic() is sufficient (see the sketch below).
2. The original "clear the polling bit" comment is kept, since it
   directly explains what the barrier orders.
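
For reference, on polling architectures __current_clr_polling() boils
down to an atomic clear of TIF_POLLING_NRFLAG, roughly (simplified
sketch, not the exact upstream code):

	static __always_inline void __current_clr_polling(void)
	{
		/* Atomically clear the polling bit of the current task. */
		clear_tsk_thread_flag(current, TIF_POLLING_NRFLAG);
	}

Since clear_tsk_thread_flag() ends up as clear_bit(), a
non-value-returning atomic RMW, smp_mb__after_atomic() is the intended
way to make the subsequent TIF_NEED_RESCHED load fully ordered after it.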

Non-polling architectures (#else branch):
1. __current_clr_polling() is a no-op, so the original comment about
   "clearing the bit" becomes misleading.
2. The smp_mb() itself must stay to preserve the pre-existing memory
   ordering guarantees; documenting it accurately would need new
   wording so it does not cause confusion.

Proposed approaches:
Option A: Add a new comment to the non-polling smp_mb(), e.g. "Paired
with resched_curr(), as per pre-existing memory ordering guarantees"
(a rough sketch follows below).
Option B: Leave the code as-is (no comment) and elaborate in the
commit message: "For non-polling architectures, retain smp_mb() to
avoid subtle regressions, while intentionally omitting the
bit-specific comment that no longer applies."
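
Concretely, Option A for the #else branch would look roughly like this
(the comment wording is only a placeholder, happy to adjust):

	static __always_inline void current_clr_polling(void)
	{
		__current_clr_polling();	/* no-op here */

		/*
		 * Paired with resched_curr(), as per the pre-existing
		 * memory ordering guarantees: keep the full barrier so
		 * the TIF_NEED_RESCHED check cannot be reordered before
		 * earlier accesses.
		 */
		smp_mb();

		preempt_fold_need_resched();
	}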

Which direction would you consider most maintainable? Your insight
would be greatly appreciated.

Best regards,
Yujun
