Message-ID: <87pl73ew6h.fsf@oracle.com>
Date: Tue, 20 Jan 2026 14:49:58 -0800
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Will Deacon <will@...nel.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
        linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-pm@...r.kernel.org, bpf@...r.kernel.org, arnd@...db.de,
        catalin.marinas@....com, peterz@...radead.org,
        akpm@...ux-foundation.org, mark.rutland@....com, harisokn@...zon.com,
        cl@...two.org, ast@...nel.org, rafael@...nel.org,
        daniel.lezcano@...aro.org, memxor@...il.com, zhenglifeng1@...wei.com,
        xueshuai@...ux.alibaba.com, joao.m.martins@...cle.com,
        boris.ostrovsky@...cle.com, konrad.wilk@...cle.com
Subject: Re: [PATCH v8 04/12] arm64: support WFET in smp_cond_relaxed_timeout()


Will Deacon <will@...nel.org> writes:

> On Fri, Jan 09, 2026 at 11:05:06AM -0800, Ankur Arora wrote:
>> 
>> Will Deacon <will@...nel.org> writes:
>> 
>> > On Sun, Dec 14, 2025 at 08:49:11PM -0800, Ankur Arora wrote:
>> >> Extend __cmpwait_relaxed() to __cmpwait_relaxed_timeout() which takes
>> >> an additional timeout value in ns.
>> >>
>> >> Lacking WFET, or with zero or negative value of timeout we fallback
>> >> to WFE.
>> >>
>> >> Cc: Arnd Bergmann <arnd@...db.de>
>> >> Cc: Catalin Marinas <catalin.marinas@....com>
>> >> Cc: Will Deacon <will@...nel.org>
>> >> Cc: linux-arm-kernel@...ts.infradead.org
>> >> Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
>> >> ---
>> >>  arch/arm64/include/asm/barrier.h |  8 ++--
>> >>  arch/arm64/include/asm/cmpxchg.h | 72 ++++++++++++++++++++++----------
>> >>  2 files changed, 55 insertions(+), 25 deletions(-)
>> >
>> > Sorry, just spotted something else on this...
>> >
>> >> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
>> >> index 6190e178db51..fbd71cd4ef4e 100644
>> >> --- a/arch/arm64/include/asm/barrier.h
>> >> +++ b/arch/arm64/include/asm/barrier.h
>> >> @@ -224,8 +224,8 @@ do {									\
>> >>  extern bool arch_timer_evtstrm_available(void);
>> >>
>> >>  /*
>> >> - * In the common case, cpu_poll_relax() sits waiting in __cmpwait_relaxed()
>> >> - * for the ptr value to change.
>> >> + * In the common case, cpu_poll_relax() sits waiting in __cmpwait_relaxed()/
>> >> + * __cmpwait_relaxed_timeout() for the ptr value to change.
>> >>   *
>> >>   * Since this period is reasonably long, choose SMP_TIMEOUT_POLL_COUNT
>> >>   * to be 1, so smp_cond_load_{relaxed,acquire}_timeout() does a
>> >> @@ -234,7 +234,9 @@ extern bool arch_timer_evtstrm_available(void);
>> >>  #define SMP_TIMEOUT_POLL_COUNT	1
>> >>
>> >>  #define cpu_poll_relax(ptr, val, timeout_ns) do {			\
>> >> -	if (arch_timer_evtstrm_available())				\
>> >> +	if (alternative_has_cap_unlikely(ARM64_HAS_WFXT))		\
>> >> +		__cmpwait_relaxed_timeout(ptr, val, timeout_ns);	\
>> >> +	else if (arch_timer_evtstrm_available())			\
>> >>  		__cmpwait_relaxed(ptr, val);				\
>> >
>> > Don't you want to make sure that we have the event stream available for
>> > __cmpwait_relaxed_timeout() too? Otherwise, a large timeout is going to
>> > cause problems.
>> 
>> Would that help though? If called from smp_cond_load_relaxed_timeout()
>> then we would wake up and just call __cmpwait_relaxed_timeout() again.
>
> Fair enough, I can see that. Is it worth capping the maximum timeout
> like we do for udelay()?

The DELAY_CONST_MAX thing?

I'm not sure whether your concern is about the overall timeout or the
timeout per WFET iteration?

For the overall limit, at least rqspinlock has a pretty large timeout
value (NSEC_PER_SEC/4).
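
To make the overall vs. per-iteration distinction concrete, here is a
minimal user-space model of the loop shape I have in mind. It is not the
kernel code: struct poll_model, model_wait() and model_cond_wait_timeout()
are hypothetical names, the clock is simulated, and wake_every_ns (assumed
nonzero) stands in for whatever wakes the waiter early. The point is only
that an early wakeup re-checks the condition and the overall deadline, and
then waits again for the remaining time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; not the kernel implementation. */
struct poll_model {
	uint64_t now_ns;        /* simulated clock */
	uint64_t wake_every_ns; /* simulated early-wakeup period, must be > 0 */
	unsigned int waits;     /* number of wait iterations entered */
};

/* Models one timed wait: returns on early wakeup or when timeout_ns expires. */
static void model_wait(struct poll_model *m, uint64_t timeout_ns)
{
	uint64_t slept = timeout_ns < m->wake_every_ns ?
			 timeout_ns : m->wake_every_ns;

	m->now_ns += slept;
	m->waits++;
}

/* Returns true if *flag is nonzero before the overall deadline passes. */
static bool model_cond_wait_timeout(struct poll_model *m, const int *flag,
				    uint64_t overall_ns)
{
	uint64_t deadline = m->now_ns + overall_ns;

	while (!*flag) {
		if (m->now_ns >= deadline)
			return false;
		/* Each iteration waits for at most the remaining time. */
		model_wait(m, deadline - m->now_ns);
	}
	return true;
}
```

With an overall timeout of NSEC_PER_SEC/4 and an early wakeup every 100us,
the model enters the wait 2500 times before giving up; without early
wakeups it would be a single long wait.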

However, it might be a good idea to attach a DELAY_CONST_MAX-like limit
when using this interface on architectures that neither have an optimized
way of polling nor define ARCH_HAS_CPU_RELAX.
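
As a rough sketch of what such a limit could look like (user-space C, not
a proposed patch): POLL_WAIT_MAX_NS, clamp_wait_ns() and
count_wait_iterations() are hypothetical names, and the cap value is
arbitrary and chosen only for illustration. Each wait is clamped to the
cap, so a large overall timeout turns into more, shorter iterations.

```c
#include <stdint.h>

/* Hypothetical per-iteration cap; the value is arbitrary. */
#define POLL_WAIT_MAX_NS 20000ULL

/* Clamp the time handed to a single wait iteration. */
static inline uint64_t clamp_wait_ns(uint64_t remaining_ns)
{
	return remaining_ns > POLL_WAIT_MAX_NS ? POLL_WAIT_MAX_NS : remaining_ns;
}

/* Number of capped iterations needed to cover overall_ns with no early exit. */
static unsigned int count_wait_iterations(uint64_t overall_ns)
{
	unsigned int n = 0;

	while (overall_ns) {
		overall_ns -= clamp_wait_ns(overall_ns);
		n++;
	}
	return n;
}
```

The trade-off is the obvious one: a smaller cap bounds how long any one
iteration can sit in the wait, at the cost of more wakeups over the same
overall timeout.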

(Currently only x86 defines ARCH_HAS_CPU_RELAX, but I had a series, meant
to go in after this one, that renames it to ARCH_HAS_OPTIMIZED_POLL and
selects it for x86 and arm64 [1].)

But that still might mean that we could have fairly long WFET iterations.
Do you foresee a problem with that?

[1] https://lore.kernel.org/lkml/20250218213337.377987-1-ankur.a.arora@oracle.com/

Thanks
-- 
ankur
