Message-ID: <d53a5852-f84a-4dae-9bf4-312751880452@paulmck-laptop>
Date: Sat, 8 Nov 2025 10:38:32 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Will Deacon <will@...nel.org>
Cc: rcu@...r.kernel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
	rostedt@...dmis.org, Catalin Marinas <catalin.marinas@....com>,
	Mark Rutland <mark.rutland@....com>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	linux-arm-kernel@...ts.infradead.org, bpf@...r.kernel.org,
	frederic@...nel.org
Subject: Re: [PATCH v2 15/16] srcu: Optimize SRCU-fast-updown for arm64

On Sat, Nov 08, 2025 at 01:07:45PM +0000, Will Deacon wrote:
> Hi Paul,
> 
> On Wed, Nov 05, 2025 at 12:32:15PM -0800, Paul E. McKenney wrote:
> > Some arm64 platforms have slow per-CPU atomic operations, for example,
> > the Neoverse V2.  This commit therefore moves SRCU-fast-updown
> > from per-CPU atomic operations to interrupt-disabled
> > non-read-modify-write-atomic
> > atomic_read()/atomic_set() operations.  This works because
> > SRCU-fast-updown is not invoked from NMI handlers, which means that
> > srcu_read_lock_fast_updown() and srcu_read_unlock_fast_updown() can
> > exclude themselves and each other by disabling interrupts.
> > 
> > This reduces the overhead of calls to srcu_read_lock_fast_updown() and
> > srcu_read_unlock_fast_updown() from about 100ns to about 12ns on an ARM
> > Neoverse V2.  Although this is not excellent compared to about 2ns on x86,
> > it sure beats 100ns.
> > 
> > This command was used to measure the overhead:
> > 
> > tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --configs NOPREEMPT --kconfig "CONFIG_NR_CPUS=64 CONFIG_TASKS_TRACE_RCU=y" --bootargs "refscale.loops=100000 refscale.guest_os_delay=5 refscale.nreaders=64 refscale.holdoff=30 torture.disable_onoff_at_boot refscale.scale_type=srcu-fast-updown refscale.verbose_batched=8 torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=8 refscale.nruns=100" --trust-make
> > 
> > Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> > Cc: Catalin Marinas <catalin.marinas@....com>
> > Cc: Will Deacon <will@...nel.org>
> > Cc: Mark Rutland <mark.rutland@....com>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
> > Cc: Steven Rostedt <rostedt@...dmis.org>
> > Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> > Cc: <linux-arm-kernel@...ts.infradead.org>
> > Cc: <bpf@...r.kernel.org>
> > ---
> >  include/linux/srcutree.h | 51 +++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 48 insertions(+), 3 deletions(-)
> 
> I've queued the per-cpu tweak from Catalin in the arm64 fixes tree [1]
> for 6.18, so please can you drop this SRCU commit from your tree?

Very good!  Adding Frederic on CC since he is doing the pull request
for the upcoming merge window.

But if this doesn't show up in -rc1, we reserve the right to put it
back in.

Sorry, couldn't resist!   ;-)

							Thanx, Paul

> Cheers,
> 
> Will
> 
> [1] https://git.kernel.org/arm64/c/535fdfc5a228
