Message-ID: <bc9896f2470c70519c3b9257a1a2dd32e5e9c6e9.camel@sipsolutions.net>
Date: Thu, 24 Apr 2025 11:30:39 +0200
From: Benjamin Berg <benjamin@...solutions.net>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	"Paul E. McKenney" <paulmck@...nel.org>
Cc: linux-modules@...r.kernel.org, linux-kernel@...r.kernel.org,
	johannes@...solutions.net, Daniel Gomez <da.gomez@...sung.com>,
	Luis Chamberlain <mcgrof@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
	Petr Pavlu <petr.pavlu@...e.com>, Sami Tolvanen <samitolvanen@...gle.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v3 15/28] module: Use RCU in all users of
 __module_text_address().

On Thu, 2025-04-24 at 11:05 +0200, Sebastian Andrzej Siewior wrote:
> On 2025-04-23 11:16:49 [-0700], Paul E. McKenney wrote:
> > On Wed, Apr 23, 2025 at 05:17:31PM +0200, Benjamin Berg wrote:
> > > Hi,
> > > 
> > > On Wed, 2025-01-08 at 10:04 +0100, Sebastian Andrzej Siewior wrote:
> > > > __module_text_address() can be invoked within a RCU section, there is no
> > > > requirement to have preemption disabled.
> > > > 
> > > > Replace the preempt_disable() section around __module_text_address()
> > > > with RCU.
> > > 
> > > Unfortunately, this patch causes a performance regression for us. The
> > > trouble is that we enable kmemleak and run trace-cmd so a lot of stack
> > > traces need to be collected. Obviously, we also have lockdep enabled.
> > > 
> > > Now, combine this with the UML stack dumping code calling into
> > > __kernel_text_address a lot[1] and it really has a relevant performance
> > > impact. I saw the kernel spending 40% of its own CPU time just on the
> > > lock in is_module_text_address.
> > > 
> > > Maybe kernel_text_address should leave the RCU handling to the caller
> > > and assume that the RCU read lock is already taken?
> > > 
> > > Benjamin
> > > 
> > > [1] The UM arch dump_stack function reads every "unsigned long" on the
> > > stack and tests it using __kernel_text_address.
> > 
> > Use of a single guard(rcu)() is regressing performance?  Interesting and
> > quite unexpected.  That said, given the amount of debug you have enabled,
> > I am not so sure that people are going to be all that excited about a
> > further performance regression.
> > 
> > But is this regression due to the cleanup hook that guard(rcu)()
> > registers?  If so, please feel free to try using rcu_read_lock()
> > and rcu_read_unlock() instead.  I would be surprised if this makes a
> > difference, but then again, your initial regression report also comes
> > as a surprise, so...
> > 
> > Another way to reduce guard(rcu)() overhead is to build your kernel
> > with CONFIG_PREEMPT_NONE=y.  Not so good for real-time response, but
> > then again, neither are your debug options.
> 
> The guard notation is not the regression, I guess; it is just the sheer
> number of rcu_read_lock()/rcu_read_unlock() calls. We had one regression
> which was "fixed" by
> commit ee57ab5a32129 ("locking/lockdep: Disable KASAN instrumentation of lockdep.c").

Yup, we really pretty much created a micro-benchmark for grabbing stack
traces.

> My guess would be that this is a preemptible kernel and the preempt
> disable/enable is cheaper than the RCU version. So going back to a
> non-preemptible kernel should "fix" it.

Yes, preempt_disable() is extremely cheap.
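
For reference, the pattern the series changes is roughly the following
(sketched from the commit message; the exact code in kernel/module/main.c
may differ a bit):

        /* Before: preempt_disable() is the read-side protection for the
         * module lookup.
         */
        bool is_module_text_address(unsigned long addr)
        {
                bool ret;

                preempt_disable();
                ret = __module_text_address(addr) != NULL;
                preempt_enable();

                return ret;
        }

        /* After: rely on an RCU read-side critical section instead.
         * guard(rcu)() is scoped rcu_read_lock()/rcu_read_unlock().
         */
        bool is_module_text_address(unsigned long addr)
        {
                guard(rcu)();
                return __module_text_address(addr) != NULL;
        }

The preempt_disable()/preempt_enable() pair is basically a preempt-count
increment and decrement, while with lockdep enabled each
rcu_read_lock()/rcu_read_unlock() pair is additionally tracked as a lock
acquisition and release, which is presumably where the extra time goes,
given that UML's dump_stack() does one such call per stack word.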

> Looking at kernel_text_address(), is_bpf_text_address() also has an RCU
> read section, so it is probably subject to the same trouble. And
> is_ftrace_trampoline() could also be converted to RCU, which would
> increase the trouble.
> 
> Improving the stack trace on UM or caching some of the most common ones
> might help. Not sure if disabling kmemleak for lockdep is possible /
> makes a difference.

What does seem to help is to simply disable lockdep inside dump_trace().
That should be good enough for us, at least, bringing the overhead down
to a manageable amount when running these tests.

Some unscientific numbers:

config                               dump_trace     locking
------------------------------------------------------------
no locking (preempt_disable)            6 %         -
guard(rcu)() + lockdep_off             15 %         58 % of that
rcu_read_lock + lockdep_off            17 %         60 % of that
guard(rcu)()                           48 %         91 % of that

That confirms that guard(rcu)() really is not a problem. There might be a
slight overhead, but it is probably within the margin of error. Turning
lockdep off/on inside the UML dump_trace() function brings the overhead
down a lot, and I guess that should be an acceptable level for us.
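
Concretely, the workaround looks roughly like this (a simplified sketch of
arch/um/kernel/stacktrace.c; the real dump_trace() also does frame-pointer
based "reliable" tracking, and the helper names may not match the tree
exactly):

        void dump_trace(struct task_struct *tsk,
                        const struct stacktrace_ops *ops, void *data)
        {
                unsigned long *sp = get_stack_pointer(tsk, tsk->thread.segv_regs);
                unsigned long addr;

                /* Every candidate word triggers __kernel_text_address(),
                 * which now takes rcu_read_lock() internally for the
                 * module/bpf range checks; lockdep_off() keeps lockdep
                 * from tracking each of those read-side sections.
                 */
                lockdep_off();
                while (!kstack_end(sp)) {
                        addr = READ_ONCE(*sp++);
                        if (__kernel_text_address(addr))
                                ops->address(data, addr, 0);
                }
                lockdep_on();
        }

The stack walk itself does not take any locks apart from the ones hidden
inside the text-address helpers, so losing lockdep coverage for exactly
this loop seems like an acceptable trade-off, at least for our test setup.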

Not sure if something like that would be desirable upstream. This is
happening for us when running the hostap "hwsim" tests inside UML (with
time-travel). At least internally, we could carry a custom patch that adds
the lockdep_off()/lockdep_on() pair to dump_trace() in order to work
around it [1].

Benjamin

[1] Actually, now I am reminded that we already have that for kmemleak,
as lockdep was considerably slowing down the scanning.

> 
> > 							Thanx, Paul
> 
> Sebastian
> 

