Message-ID: <aM2CR3peZkQlL0S1@arm.com>
Date: Fri, 19 Sep 2025 17:18:15 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Will Deacon <will@...nel.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
	linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	bpf@...r.kernel.org, arnd@...db.de, peterz@...radead.org,
	akpm@...ux-foundation.org, mark.rutland@....com,
	harisokn@...zon.com, cl@...two.org, ast@...nel.org,
	memxor@...il.com, zhenglifeng1@...wei.com,
	xueshuai@...ux.alibaba.com, joao.m.martins@...cle.com,
	boris.ostrovsky@...cle.com, konrad.wilk@...cle.com
Subject: Re: [PATCH v5 2/5] arm64: barrier: Add
 smp_cond_load_relaxed_timeout()

On Thu, Sep 18, 2025 at 09:05:22PM +0100, Will Deacon wrote:
> > +/* Re-declared here to avoid include dependency. */
> > +extern bool arch_timer_evtstrm_available(void);
> > +
> > +#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
> > +({									\
> > +	typeof(ptr) __PTR = (ptr);					\
> > +	__unqual_scalar_typeof(*ptr) VAL;				\
> > +	bool __wfe = arch_timer_evtstrm_available();			\
> > +									\
> > +	for (;;) {							\
> > +		VAL = READ_ONCE(*__PTR);				\
> > +		if (cond_expr)						\
> > +			break;						\
> > +		if (time_check_expr)					\
> > +			break;						\
> > +		if (likely(__wfe))					\
> > +			__cmpwait_relaxed(__PTR, VAL);			\
> > +		else							\
> > +			cpu_relax();					\
> 
> It'd be an awful lot nicer if we could just use the generic code if
> WFE isn't available. One option would be to make that available as
> e.g. __smp_cond_load_relaxed_timeout_cpu_relax() and call it from the
> arch code when !arch_timer_evtstrm_available(), but a potentially
> cleaner approach would be to introduce something like cpu_poll_relax()
> and use that in the core code.
> 
> So arm64 would do:
> 
> #define SMP_TIMEOUT_SPIN_COUNT	1
> #define cpu_poll_relax(ptr, val)	do {				\
> 	if (arch_timer_evtstrm_available())				\
> 		__cmpwait_relaxed(ptr, val);				\
> 	else								\
> 		cpu_relax();						\
> } while (0)
> 
> and then the core code would have:
> 
> #ifndef cpu_poll_relax
> #define cpu_poll_relax(p, v)	cpu_relax()
> #endif
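
To make that concrete, I'd expect the generic side to end up looking
roughly like the below (untested sketch; SMP_TIMEOUT_SPIN_COUNT and
cpu_poll_relax() are the names from your snippet, the actual patch may
differ):

#ifndef SMP_TIMEOUT_SPIN_COUNT
#define SMP_TIMEOUT_SPIN_COUNT	200
#endif

#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
({									\
	typeof(ptr) __PTR = (ptr);					\
	__unqual_scalar_typeof(*ptr) VAL;				\
	u32 __n = 0;							\
									\
	for (;;) {							\
		VAL = READ_ONCE(*__PTR);				\
		if (cond_expr)						\
			break;						\
		cpu_poll_relax(__PTR, VAL);				\
		/* amortise the potentially expensive time check */	\
		if (++__n < SMP_TIMEOUT_SPIN_COUNT)			\
			continue;					\
		if (time_check_expr)					\
			break;						\
		__n = 0;						\
	}								\
	(typeof(*ptr))VAL;						\
})

With SMP_TIMEOUT_SPIN_COUNT defined to 1 on arm64, the time check runs
on every iteration, which is what we want when cpu_poll_relax() already
waits in WFE.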

A slight problem here is that we have two users wanting different spin
counts: poll_idle() uses 200 while rqspinlock wants 16K. Those values
were chosen empirically, but I guess the right count also depends on
what the time_check_expr calls and the resolution the user needs. From
the discussion on patch 5, Kumar would like to override the spin count
to 16K from the current one of 200 (or, if poll_idle() works with 16K,
we just set that as the default; we have yet to hear from the cpuidle
folk).
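
For example, with poll_idle() converted over, the time check is a clock
read, which is what makes the count matter (hypothetical sketch only,
not the actual cpuidle code):

	u64 limit = local_clock_noinstr() + POLL_IDLE_TIME_LIMIT;
	unsigned long flags;

	flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
					      VAL & _TIF_NEED_RESCHED,
					      local_clock_noinstr() >= limit);

A larger spin count amortises the clock read over more polls at the
cost of coarser timeout resolution.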

I guess on arm64 we'd first #undef it and then redefine it as 1. The
non-event-stream fallback is really only for debugging; I'd expect the
event stream to always be on in production (or we'd go for WFET
instead).
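
i.e. something like:

/* arm64, on top of the cpu_poll_relax() definition above */
#undef SMP_TIMEOUT_SPIN_COUNT
#define SMP_TIMEOUT_SPIN_COUNT	1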

So yeah, I think the above would work. Ankur proposed something similar
in the early versions of this series, but I found it too complicated (a
spin/wait policy callback populating the spin count). Your proposal
looks a lot simpler.

Thanks.

-- 
Catalin
