[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bebbed4a-ced1-42c5-865c-dc9dc7857b6c@efficios.com>
Date: Thu, 7 Mar 2024 08:53:05 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: paulmck@...nel.org
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>, linke li <lilinke99@...com>,
joel@...lfernandes.org, boqun.feng@...il.com, dave@...olabs.net,
frederic@...nel.org, jiangshanlai@...il.com, josh@...htriplett.org,
linux-kernel@...r.kernel.org, qiang.zhang1211@...il.com,
quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix
rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency
bug
On 2024-03-06 22:37, Paul E. McKenney wrote:
> On Wed, Mar 06, 2024 at 10:06:21PM -0500, Mathieu Desnoyers wrote:
[...]
>
>> As far as the WRITE_ONCE(x, READ_ONCE(x) + 1) pattern
>> is concerned, the only valid use-case I can think of is
>> split counters or RCU implementations where there is a
>> single updater doing the increment, and one or more
>> concurrent reader threads that need to snapshot a
>> consistent value with READ_ONCE().
>
[...]
>
> So what would you use that pattern for?
>
> One possibility is a per-CPU statistical counter in userspace on a
> fastpath, in cases where losing the occasional count is OK. Then learning
> your CPU (and possibly being immediately migrated to some other CPU),
> READ_ONCE() of the count, increment, and WRITE_ONCE() might (or might not)
> make sense.
>
> I suppose the same in the kernel if there was a fastpath so extreme you
> could not afford to disable preemption.
>
> At best, very niche.
>
> Or am I suffering a failure of imagination yet again? ;-)
The (niche) use-cases I have in mind are split-counters and RCU
grace period tracking, where precise counters totals are needed
(no lost count).
In the kernel, this could be:
- A per-cpu counter, each counter incremented from thread context with
preemption disabled (single updater per counter), read concurrently by
other threads. WRITE_ONCE/READ_ONCE is useful to make sure there
is no store/load tearing there. Atomics on the update would be stronger
than necessary on the increment fast-path.
- A per-thread counter (e.g. within task_struct), only incremented by the
single thread, read by various threads concurrently.
- A counter which increment happens to be already protected by a lock, read
by various threads without taking the lock. (technically doable, but
I'm not sure I see a relevant use-case for it)
In user-space:
- The "per-cpu" counter would have to use rseq for increments to prevent
inopportune migrations, which needs to be implemented in assembler anyway.
The counter reads would have to use READ_ONCE().
- The per-thread counter (Thread-Local Storage) incremented by a single
thread, read by various threads concurrently, is a good target
for WRITE_ONCE()/READ_ONCE() pairing. This is actually what we do in
various liburcu implementations which track read-side critical sections
per-thread.
- Same as for the kernel, a counter increment protected by a lock which
needs to be read from various threads concurrently without taking
the lock could be a valid use-case, though I fail to see how it is
useful due to lack of imagination on my part. ;-)
I'm possibly missing other use-cases, but those come to mind as not
involving racy counter increments.
I agree that use-cases are so niche that we probably do _not_ want to
introduce ADD_SHARED() for that purpose in a common header file,
because I suspect that it would then become misused in plenty of
scenarios where the updates are actually racy and would be better
served by atomics or local-atomics.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists