Message-ID: <a653ce24-8dba-4a17-a3ce-68b49c99dc8d@paulmck-laptop>
Date: Thu, 7 Nov 2024 06:15:31 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Zilin Guan <zilinguan811@...il.com>
Cc: boqun.feng@...il.com, frederic@...nel.org, jiangshanlai@...il.com,
	joel@...lfernandes.org, josh@...htriplett.org,
	linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
	neeraj.upadhyay@...nel.org, qiang.zhang1211@...il.com,
	rcu@...r.kernel.org, rostedt@...dmis.org, urezki@...il.com,
	xujianhao01@...il.com
Subject: Re: [PATCH] rcu: Use READ_ONCE() for rdp->gpwrap access in
 __note_gp_changes()

On Thu, Nov 07, 2024 at 02:01:17PM +0000, Zilin Guan wrote:
> On Wed, Nov 06, 2024 at 12:18:25PM -0800, Paul E. McKenney wrote:
> > Good eyes!!!
> > 
> > But did you find this with KCSAN, or by visual inspection?
> > 
> > The reason that I ask is that __note_gp_changes() should be
> > invoked with the leaf rnp->lock held, which should exclude writes to
> > the rdp->gpwrap fields for all CPUs corresponding to that leaf rcu_node
> > structure.
> > 
> > Note the raw_lockdep_assert_held_rcu_node(rnp) call at the beginning of
> > this function.
> > 
> > So I believe that the proper fix is to *remove* READ_ONCE() from accesses
> > to rdp->gpwrap in this function.
> > 
> > Or am I missing something here?
> > 
> >                                                         Thanx, Paul
> 
> I found this by visual inspection.

Good eyes!  ;-)

> When reviewing the function __note_gp_changes(), I noticed that other 
> accesses to rdp->gpwrap are protected with either READ_ONCE() or 
> WRITE_ONCE(), which led me to suspect a potential data race at line 1305.
> 
> However, I am not certain whether holding rnp->lock protects access to 
> rdp->gpwrap in this case. If it indeed ensures that no concurrent writes
> can occur, then I agree that the correct approach would be to remove 
> READ_ONCE() from those accesses.

One way to check this is via inspection of all the updates to the
->gpwrap field.
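
As a rough sketch of such an inspection, something like the following
lists every source line that mentions the field.  The helper name
list_gpwrap_accesses is hypothetical (it is not part of the kernel tree),
and the default directory kernel/rcu is an assumption about where the
accesses live:

```shell
# Hypothetical helper: list every C source/header line that mentions
# "->gpwrap" under the given directory (default: kernel/rcu, an assumption).
list_gpwrap_accesses() {
    # -e keeps grep from parsing the leading "-" of the pattern as an option.
    grep -rn --include='*.[ch]' -e '->gpwrap' -- "${1:-kernel/rcu}"
}
```

Piping the result through "grep -e WRITE_ONCE -e 'gpwrap *='" narrows it
to likely updates, each of which can then be checked by hand for the
lock that is held at that point.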

Another approach is to run KCSAN, for example, from the top-level
directory of the Linux-kernel source tree on a system with qemu/KVM
enabled:

	tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 30m --configs "4*TREE03" --kconfigs "CONFIG_NR_CPUS=4" --kcsan --trust-make

This particular command is set up for my 16-CPU laptop.  You can of
course adjust the "4*" and the "=4" to match your hardware.  For example,
on a 64-CPU system you might instead do this:

	tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 30m --configs "8*TREE03" --kconfigs "CONFIG_NR_CPUS=8" --kcsan --trust-make
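
The two commands above suggest one possible sizing rule: the number of
TREE03 instances times CONFIG_NR_CPUS equals the host CPU count (4*4=16,
8*8=64).  That rule is an inference from these two examples, not
something kvm.sh requires, and the helper name kvm_sh_sizing below is
hypothetical.  A minimal sketch under that assumption:

```shell
# Hypothetical helper: given the host CPU count and the desired CPUs per
# guest, print matching --configs and --kconfigs arguments for kvm.sh.
# Assumes instances * per-guest CPUs should equal the host CPU count.
kvm_sh_sizing() {
    sizing_total=$1
    sizing_per=$2
    sizing_instances=$(( sizing_total / sizing_per ))
    # Always run at least one instance, even on very small hosts.
    [ "$sizing_instances" -ge 1 ] || sizing_instances=1
    printf '%s\n' "--configs \"${sizing_instances}*TREE03\" --kconfigs \"CONFIG_NR_CPUS=${sizing_per}\""
}

kvm_sh_sizing 16 4   # prints: --configs "4*TREE03" --kconfigs "CONFIG_NR_CPUS=4"
kvm_sh_sizing 64 8   # prints: --configs "8*TREE03" --kconfigs "CONFIG_NR_CPUS=8"
```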

Please see Documentation/dev-tools/kcsan.rst for information on how
to interpret KCSAN reports.

This will find false positives in the non-RCU portions of the kernel,
so you should look for reports involving __note_gp_changes() and/or
its callers (inlining and all that).
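
One way to narrow the output is to pull out only the KCSAN report
headlines that name a function of interest.  The sketch below assumes
that each guest's console output is saved in a file named console.log
somewhere under the run's results directory, and that each report begins
with a "BUG: KCSAN:" line naming the racing functions.  The helper name
kcsan_headlines is hypothetical:

```shell
# Hypothetical helper: print KCSAN report headlines found in console.log
# files under directory $1 that match the pattern in $2.
kcsan_headlines() {
    grep -rh --include='console.log' -e 'BUG: KCSAN:' -- "$1" | grep -e "$2"
}
```

For example, "kcsan_headlines <results-dir> note_gp_changes" would list
headlines naming __note_gp_changes() or note_gp_changes().  Because of
inlining, a relevant report might name only a caller, so the unfiltered
headlines are still worth a skim.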

So why not try it?  ;-)

							Thanx, Paul
