Message-ID: <29a20fd4-ac5e-44a3-bc8a-9f77aa6a3cf9@paulmck-laptop>
Date: Wed, 6 Mar 2024 19:21:14 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Steven Rostedt <rostedt@...dmis.org>, linke li <lilinke99@...com>,
joel@...lfernandes.org, boqun.feng@...il.com, dave@...olabs.net,
frederic@...nel.org, jiangshanlai@...il.com, josh@...htriplett.org,
linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
qiang.zhang1211@...il.com, quic_neeraju@...cinc.com,
rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency bug

On Wed, Mar 06, 2024 at 06:49:38PM -0800, Linus Torvalds wrote:
> On Wed, 6 Mar 2024 at 18:43, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > I dunno.
>
> Oh, and just looking at that patch, I still think the code is confused.
>
> On the reading side, we have:
>
>     pipe_count = smp_load_acquire(&p->rtort_pipe_count);
>     if (pipe_count > RCU_TORTURE_PIPE_LEN) {
>         /* Should not happen, but... */
>
> where that comment clearly says that the pipe_count we read (whether
> with READ_ONCE() or with my smp_load_acquire() suggestion) should
> never be larger than RCU_TORTURE_PIPE_LEN.
I will fix that comment. It should not happen *if* RCU is working
correctly. It can happen if you have an RCU that is so broken that a
single RCU reader can span more than ten grace periods. An example of
an RCU that really is this broken can be selected using rcutorture's
torture_type=busted module parameter. This is no surprise, given that its
implementation of call_rcu() invokes the callback function directly and
its implementation of synchronize_rcu() is a complete no-op. ;-)
Of course, the purpose of that value of the torture_type module parameter
(along with all other possible values containing the string "busted")
is to test rcutorture itself.
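
For illustration only, here is a minimal userspace sketch of what "busted" means in the description above: a call_rcu() stand-in that invokes the callback directly and a synchronize_rcu() stand-in that does nothing. All of the names (rcu_head_sketch, busted_call_rcu, busted_synchronize, item, reader_sees_premature_callback) are hypothetical and are not the rcutorture implementation; the point is only that a reader can observe the callback's effect while it still holds its reference.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct rcu_head_sketch {
	void (*func)(struct rcu_head_sketch *head);
};

/* "Busted" synchronize: returns at once, waiting for no reader. */
static void busted_synchronize(void)
{
}

/* "Busted" call_rcu: runs the callback immediately, with no grace period. */
static void busted_call_rcu(struct rcu_head_sketch *head,
			    void (*func)(struct rcu_head_sketch *head))
{
	head->func = func;
	func(head);
}

/* Hypothetical protected object; rh is first so the cast below is valid. */
struct item {
	struct rcu_head_sketch rh;
	int freed;
};

/* Callback that marks the item as reclaimed instead of really freeing it. */
static void item_cb(struct rcu_head_sketch *head)
{
	struct item *it = (struct item *)head;

	it->freed = 1;
}

/*
 * Returns 1 if the callback ran while a (simulated) reader still held
 * a reference, which a working RCU would never allow.
 */
static int reader_sees_premature_callback(void)
{
	struct item it = { .freed = 0 };
	struct item *reader_ref = &it;	/* reader still "inside" its critical section */

	busted_call_rcu(&it.rh, item_cb);	/* updater "defers" the free */
	busted_synchronize();			/* waits for nothing */
	return reader_ref->freed;
}
```

With these stand-ins, reader_sees_premature_callback() returns 1, which is the kind of failure rcutorture is expected to detect when it is pointed at a deliberately broken implementation.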
> But the writing side very clearly did:
>
>     i = rp->rtort_pipe_count;
>     if (i > RCU_TORTURE_PIPE_LEN)
>         i = RCU_TORTURE_PIPE_LEN;
>     ...
>     smp_store_release(&rp->rtort_pipe_count, ++i);
>
> (again, syntactically it could have been "i + 1" instead of my "++i" -
> same value), so clearly the writing side *can* write a value that is >
> RCU_TORTURE_PIPE_LEN.
>
> So while the whole READ/WRITE_ONCE vs smp_load_acquire/store_release
> is one thing that might be worth looking at, I think there are other
> very confusing aspects here.
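
The arithmetic in the quoted writer-side sequence can be checked in isolation. What follows is only a userspace sketch: PIPE_LEN is an assumed stand-in for RCU_TORTURE_PIPE_LEN (taken as 10, matching the "ten grace periods" mentioned earlier), and the two helper names are hypothetical. It shows that clamping and then incrementing can store PIPE_LEN + 1, and that the reader-side clamp brings any such value back down to PIPE_LEN.

```c
/* Assumed value standing in for RCU_TORTURE_PIPE_LEN. */
#define PIPE_LEN 10

/* Mirrors the quoted writer side: clamp first, then increment. */
static int writer_next_pipe_count(int current)
{
	int i = current;

	if (i > PIPE_LEN)
		i = PIPE_LEN;
	return i + 1;	/* can be PIPE_LEN + 1 */
}

/* Mirrors the quoted reader side: clamp whatever value was observed. */
static int reader_clamped_pipe_count(int observed)
{
	if (observed > PIPE_LEN)
		return PIPE_LEN;
	return observed;
}
```

So the largest value the writer-side sequence can produce is PIPE_LEN + 1, no matter how large the starting count is, and the reader only sees a value above PIPE_LEN if it is still holding an element that has already passed through the whole pipeline.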
With this change in that comment, are things better?
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 6b821a7037b03..0cb5452ecd945 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2000,7 +2000,8 @@ static bool rcu_torture_one_read(struct torture_random_state *trsp, long myid)
 	preempt_disable();
 	pipe_count = READ_ONCE(p->rtort_pipe_count);
 	if (pipe_count > RCU_TORTURE_PIPE_LEN) {
-		/* Should not happen, but... */
+		// Should not happen in a correct RCU implementation,
+		// happens quite often for torture_type=busted.
 		pipe_count = RCU_TORTURE_PIPE_LEN;
 	}
 	completed = cur_ops->get_gp_seq();