[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wgPAZ4KnCQergqAOUypwinYh=gZ0q4EQbwvuUcJ_8UK+Q@mail.gmail.com>
Date: Wed, 6 Mar 2024 11:46:10 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, linke li <lilinke99@...com>, joel@...lfernandes.org,
boqun.feng@...il.com, dave@...olabs.net, frederic@...nel.org,
jiangshanlai@...il.com, josh@...htriplett.org, linux-kernel@...r.kernel.org,
mathieu.desnoyers@...icios.com, qiang.zhang1211@...il.com,
quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer()
data race and concurrency bug
On Wed, 6 Mar 2024 at 11:27, Steven Rostedt <rostedt@...dmis.org> wrote:
>
> Note this has nothing to do with tracing. This thread is in RCU. I just
> happen to receive the same patch "fix" for my code.
Ok, googling for rtort_pipe_count, I can only state that that code is
complete garbage.
And no amount of READ_ONCE/WRITE_ONCE will fix it.
For one thing, we have this code:
WRITE_ONCE(rp->rtort_pipe_count, i + 1);
if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
which is broken by design. The compiler is allowed to (and probably
does) turn that into just
WRITE_ONCE(rp->rtort_pipe_count, i + 1);
if (i + 1 >= RCU_TORTURE_PIPE_LEN) {
which only results in the question "Why didn't the source code do that
obvious simplification itself?"
So that code is actively *STUPID*. It's randomly mixing READ_ONCE and
regular reads in ways that just makes me go: "there's no saving this
shit".
This needs fixing. Having tests that have random code in them only
makes me doubt that the *TEST* itself is correct, rather than the code
it is trying to actually test.
And dammit, none of that makes sense anyway. This is not some
performance-crticial code. Why is it not using proper atomics if there
is an actual data race?
The reason to use READ_ONCE() and WRITE_ONCE() is that they can be a
lot faster than atomics, or - more commonly - because you have some
fundamental algorithm that doesn't do arithmetic, but cares about some
"state at time X" (the RCU _pointer_ being one such obvious case, but
doing an *increment* sure as hell isn't).
So using those READ_ONCE/WRITE_ONCE macros for that thing is
fundamntally wrong to begin with.
The question should not be "should we add another READ_ONCE()". The
question should be "what drugs were people on when writing this code"?
People - please just stop writing garbage.
That 'rtort_pipe_count' should be an atomic_t, and the "add one and
return the old value" should be an "atomic_inc_return()-1" (the "-1"
is because "inc_return" returns the *new* value).
And feel free to add "_relaxed()" to that atomic op because this code
doesn't care about ordering of that counter. It will help on some
architectures, but as mentioned, this is not performance-crticial code
to begin with.
Linus
Powered by blists - more mailing lists