Date: Wed, 6 Mar 2024 19:37:13 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>, linke li <lilinke99@...com>,
	joel@...lfernandes.org, boqun.feng@...il.com, dave@...olabs.net,
	frederic@...nel.org, jiangshanlai@...il.com, josh@...htriplett.org,
	linux-kernel@...r.kernel.org, qiang.zhang1211@...il.com,
	quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix
 rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency
 bug

On Wed, Mar 06, 2024 at 10:06:21PM -0500, Mathieu Desnoyers wrote:
> On 2024-03-06 21:43, Linus Torvalds wrote:
> [...]
> > 
> > Honestly, this all makes me think that we'd be *much* better off
> > showing the real "handoff" with smp_store_release() and
> > smp_load_acquire().
> 
> We've done something similar in liburcu (userspace code) to allow
> Thread Sanitizer to understand the happens-before relationships
> within the RCU implementations and lock-free data structures.
> 
> Moving to load-acquire/store-release (C11 model in our case)
> allowed us to provide enough happens-before relationship for
> Thread Sanitizer to understand what is happening under the
> hood in liburcu and perform relevant race detection of user
> code.

Good point!

In the kernel, though, KCSAN understands the Linux-kernel memory model,
and so we don't get that sort of false positive.
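
For anyone who wants to see it spelled out, the handoff pattern that
TSan understands looks something like the following minimal userspace
sketch (my illustration, not liburcu's actual code):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int payload;		/* Handed off from producer to consumer. */
static atomic_int ready;	/* 0 = not yet, 1 = payload is valid. */

static void *producer(void *arg)
{
	payload = 42;	/* Plain store, ordered by the release below. */
	atomic_store_explicit(&ready, 1, memory_order_release);
	return NULL;
}

static void *consumer(void *arg)
{
	/* Once the acquire load observes 1, the payload store is visible. */
	while (!atomic_load_explicit(&ready, memory_order_acquire))
		continue;	/* Spin-wait, fine for a sketch. */
	printf("%d\n", payload);	/* Prints 42, with no data race. */
	return NULL;
}

int main(void)
{
	pthread_t p, c;

	pthread_create(&p, NULL, producer, NULL);
	pthread_create(&c, NULL, consumer, NULL);
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return 0;
}

Compile that with -fsanitize=thread and TSan stays quiet; make "ready"
a plain int and it will quite correctly complain.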

> As far as the WRITE_ONCE(x, READ_ONCE(x) + 1) pattern
> is concerned, the only valid use-case I can think of is
> split counters or RCU implementations where there is a
> single updater doing the increment, and one or more
> concurrent reader threads that need to snapshot a
> consistent value with READ_ONCE().
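
In userspace terms, I take that split-counter shape to be roughly the
following sketch (my reconstruction, with C11 relaxed atomics standing
in for READ_ONCE()/WRITE_ONCE(), and a made-up NR_SLOTS bound):

#include <stdatomic.h>

#define NR_SLOTS 64	/* Hypothetical number of updater threads. */

static _Atomic unsigned long counts[NR_SLOTS];

/* Called only by the single thread owning @slot, so this non-atomic
 * read-modify-write cannot lose counts. */
static void count_inc(int slot)
{
	unsigned long c;

	c = atomic_load_explicit(&counts[slot], memory_order_relaxed);
	atomic_store_explicit(&counts[slot], c + 1, memory_order_relaxed);
}

/* Any thread may snapshot.  Each per-slot load is tear-free, though
 * the total may be slightly stale by the time it is returned. */
static unsigned long count_sum(void)
{
	unsigned long sum = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		sum += atomic_load_explicit(&counts[i], memory_order_relaxed);
	return sum;
}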

That pattern is wrong here, though.  OK, not wrong from a functional
viewpoint, but it is at best very confusing.  I am applying patches to
fix this, and will push out a new "dev" branch on -rcu that makes this
function appear as shown below.

So what would you use that pattern for?

One possibility is a per-CPU statistical counter in userspace on a
fastpath, in cases where losing the occasional count is OK.  There, the
sequence of learning your CPU (and possibly being migrated to some
other CPU immediately afterwards), doing a READ_ONCE() of that CPU's
count, incrementing it, and storing the result with WRITE_ONCE() might
(or might not) make sense.
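
Concretely, perhaps something like this sketch (hypothetical names, and
sched_getcpu() is Linux-specific):

#define _GNU_SOURCE
#include <sched.h>	/* sched_getcpu() */
#include <stdatomic.h>

#define MAX_CPUS 256	/* Hypothetical upper bound. */

static _Atomic unsigned long stat_count[MAX_CPUS];

/* Fastpath increment with no preemption control: we can be migrated
 * between the load and the store, in which case a concurrent increment
 * on the old CPU's slot can be lost.  This use case tolerates that. */
static void stat_inc(void)
{
	int cpu = sched_getcpu();
	unsigned long c;

	if (cpu < 0)
		return;	/* Sketch-grade error handling. */
	c = atomic_load_explicit(&stat_count[cpu], memory_order_relaxed);
	atomic_store_explicit(&stat_count[cpu], c + 1, memory_order_relaxed);
}

Unlike the single-owner split counter above, two threads can here
transiently store to the same slot, so the occasional lost increment is
tolerated rather than excluded by construction.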

I suppose the same pattern could be used in the kernel, if there were a
fastpath so extreme that you could not afford to disable preemption.

At best, very niche.

Or am I suffering a failure of imagination yet again?  ;-)

							Thanx, Paul

------------------------------------------------------------------------

static bool
rcu_torture_pipe_update_one(struct rcu_torture *rp)
{
	int i;
	struct rcu_torture_reader_check *rtrcp = READ_ONCE(rp->rtort_chkp);

	if (rtrcp) {
		WRITE_ONCE(rp->rtort_chkp, NULL);
		smp_store_release(&rtrcp->rtc_ready, 1); // Pair with smp_load_acquire().
	}
	i = rp->rtort_pipe_count; // Plain read: this function is the sole updater.
	if (i > RCU_TORTURE_PIPE_LEN)
		i = RCU_TORTURE_PIPE_LEN;
	atomic_inc(&rcu_torture_wcount[i]);
	WRITE_ONCE(rp->rtort_pipe_count, i + 1); // Pair with reader-side READ_ONCE().
	ASSERT_EXCLUSIVE_WRITER(rp->rtort_pipe_count); // KCSAN checks the single-updater claim.
	if (i + 1 >= RCU_TORTURE_PIPE_LEN) {
		rp->rtort_mbtest = 0;
		return true;
	}
	return false;
}
