Message-ID: <87qzw2f1rv.fsf@oracle.com>
Date: Fri, 19 Sep 2025 16:41:56 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Will Deacon <will@...nel.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
        linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        bpf@...r.kernel.org, arnd@...db.de, catalin.marinas@....com,
        peterz@...radead.org, akpm@...ux-foundation.org, mark.rutland@....com,
        harisokn@...zon.com, cl@...two.org, ast@...nel.org, memxor@...il.com,
        zhenglifeng1@...wei.com, xueshuai@...ux.alibaba.com,
        joao.m.martins@...cle.com, boris.ostrovsky@...cle.com,
        konrad.wilk@...cle.com
Subject: Re: [PATCH v5 1/5] asm-generic: barrier: Add
 smp_cond_load_relaxed_timeout()


Will Deacon <will@...nel.org> writes:

> On Wed, Sep 10, 2025 at 08:46:51PM -0700, Ankur Arora wrote:
>> Add smp_cond_load_relaxed_timeout(), which extends
>> smp_cond_load_relaxed() to allow waiting for a duration.
>>
>> The additional parameter allows for the timeout check.
>>
>> The waiting is done via the usual cpu_relax() spin-wait around the
>> condition variable with periodic evaluation of the time-check.
>>
>> The number of times we spin is defined by SMP_TIMEOUT_SPIN_COUNT
>> (chosen to be 200 by default) which, assuming each cpu_relax()
>> iteration takes around 20-30 cycles (measured on a variety of x86
>> platforms), amounts to around 4000-6000 cycles.
>>
>> Cc: Arnd Bergmann <arnd@...db.de>
>> Cc: Will Deacon <will@...nel.org>
>> Cc: Catalin Marinas <catalin.marinas@....com>
>> Cc: Peter Zijlstra <peterz@...radead.org>
>> Cc: linux-arch@...r.kernel.org
>> Reviewed-by: Catalin Marinas <catalin.marinas@....com>
>> Reviewed-by: Haris Okanovic <harisokn@...zon.com>
>> Tested-by: Haris Okanovic <harisokn@...zon.com>
>> Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
>> ---
>>  include/asm-generic/barrier.h | 35 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 35 insertions(+)
>>
>> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
>> index d4f581c1e21d..8483e139954f 100644
>> --- a/include/asm-generic/barrier.h
>> +++ b/include/asm-generic/barrier.h
>> @@ -273,6 +273,41 @@ do {									\
>>  })
>>  #endif
>>
>> +#ifndef SMP_TIMEOUT_SPIN_COUNT
>> +#define SMP_TIMEOUT_SPIN_COUNT		200
>> +#endif
>> +
>> +/**
>> + * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
>> + * guarantees until a timeout expires.
>> + * @ptr: pointer to the variable to wait on
>> + * @cond: boolean expression to wait for
>> + * @time_check_expr: expression to decide when to bail out
>> + *
>> + * Equivalent to using READ_ONCE() on the condition variable.
>> + */
>> +#ifndef smp_cond_load_relaxed_timeout
>> +#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
>> +({									\
>> +	typeof(ptr) __PTR = (ptr);					\
>> +	__unqual_scalar_typeof(*ptr) VAL;				\
>> +	u32 __n = 0, __spin = SMP_TIMEOUT_SPIN_COUNT;			\
>> +									\
>> +	for (;;) {							\
>> +		VAL = READ_ONCE(*__PTR);				\
>> +		if (cond_expr)						\
>> +			break;						\
>> +		cpu_relax();						\
>> +		if (++__n < __spin)					\
>> +			continue;					\
>> +		if (time_check_expr)					\
>> +			break;						\
>
> There's a funny discrepancy here when compared to the arm64 version in
> the next patch. Here, if we time out, then the value returned is
> potentially quite stale because it was read before the last cpu_relax().
> In the arm64 patch, the timeout check is before the cmpwait/cpu_relax(),
> which I think is better.

So, that's a good point. And the return value being stale also seems
incorrect in its own right: on a timeout we hand back a value that was
read before the final cpu_relax().

> Regardless, I think having the same behaviour for the two implementations
> would be a good idea.

Yeah, agreed.

As you outlined in the other mail, how about something like this:

#ifndef smp_cond_load_relaxed_timeout
#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
({									\
	typeof(ptr) __PTR = (ptr);					\
	__unqual_scalar_typeof(*ptr) VAL;				\
	u32 __n = 0, __poll = SMP_TIMEOUT_POLL_COUNT;			\
									\
	for (;;) {							\
		VAL = READ_ONCE(*__PTR);				\
		if (cond_expr)						\
			break;						\
		cpu_poll_relax();					\
		if (++__n < __poll)					\
			continue;					\
		if (time_check_expr) {					\
			VAL = READ_ONCE(*__PTR);			\
			break;						\
		}							\
		__n = 0;						\
	}								\
	(typeof(*ptr))VAL;						\
})
#endif

A bit uglier, but without the re-read, if the cpu_poll_relax() was a
successful WFE the returned value might be ~100us out of date.
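
For context, a caller would look something like this (the ktime
helpers are just one way of writing the time check, and dev->flags is
a stand-in for the condition variable):

	ktime_t deadline = ktime_add_us(ktime_get(), 100);
	u32 flags;

	/* Spin until dev->flags goes non-zero or the deadline passes. */
	flags = smp_cond_load_relaxed_timeout(&dev->flags, VAL != 0,
				ktime_compare(ktime_get(), deadline) > 0);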

Another option might be to just set some state in the time check and
bail out via an "if (cond_expr || __timed_out)", but I don't want to
add more instructions to the spin path (sketch below).
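
For reference, that variant would look roughly like this (untested;
__timed_out is the extra local, and the "|| __timed_out" check is
exactly the added cost in the hot loop):

#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
({									\
	typeof(ptr) __PTR = (ptr);					\
	__unqual_scalar_typeof(*ptr) VAL;				\
	u32 __n = 0, __poll = SMP_TIMEOUT_POLL_COUNT;			\
	bool __timed_out = false;					\
									\
	for (;;) {							\
		VAL = READ_ONCE(*__PTR);				\
		if (cond_expr || __timed_out)				\
			break;						\
		cpu_poll_relax();					\
		if (++__n < __poll)					\
			continue;					\
		__timed_out = !!(time_check_expr);			\
		__n = 0;						\
	}								\
	(typeof(*ptr))VAL;						\
})

This gets the fresh re-read for free (the load at the top of the loop
runs once more after the timeout is noticed), at the price of the
extra check on every iteration.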

--
ankur
