linux-kernel - Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <bebbed4a-ced1-42c5-865c-dc9dc7857b6c@efficios.com>
Date: Thu, 7 Mar 2024 08:53:05 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: paulmck@...nel.org
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
 Steven Rostedt <rostedt@...dmis.org>, linke li <lilinke99@...com>,
 joel@...lfernandes.org, boqun.feng@...il.com, dave@...olabs.net,
 frederic@...nel.org, jiangshanlai@...il.com, josh@...htriplett.org,
 linux-kernel@...r.kernel.org, qiang.zhang1211@...il.com,
 quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix
 rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency
 bug

On 2024-03-06 22:37, Paul E. McKenney wrote:
> On Wed, Mar 06, 2024 at 10:06:21PM -0500, Mathieu Desnoyers wrote:
[...]
> 
>> As far as the WRITE_ONCE(x, READ_ONCE(x) + 1) pattern
>> is concerned, the only valid use-case I can think of is
>> split counters or RCU implementations where there is a
>> single updater doing the increment, and one or more
>> concurrent reader threads that need to snapshot a
>> consistent value with READ_ONCE().
> 
[...]
> 
> So what would you use that pattern for?
> 
> One possibility is a per-CPU statistical counter in userspace on a
> fastpath, in cases where losing the occasional count is OK.  Then learning
> your CPU (and possibly being immediately migrated to some other CPU),
> READ_ONCE() of the count, increment, and WRITE_ONCE() might (or might not)
> make sense.
> 
> I suppose the same in the kernel if there was a fastpath so extreme you
> could not afford to disable preemption.
> 
> At best, very niche.
> 
> Or am I suffering a failure of imagination yet again?  ;-)

The (niche) use-cases I have in mind are split-counters and RCU
grace period tracking, where precise counters totals are needed
(no lost count).

In the kernel, this could be:

- A per-cpu counter, each counter incremented from thread context with
   preemption disabled (single updater per counter), read concurrently by
   other threads. WRITE_ONCE/READ_ONCE is useful to make sure there
   is no store/load tearing there. Atomics on the update would be stronger
   than necessary on the increment fast-path.

- A per-thread counter (e.g. within task_struct), only incremented by the
   single thread, read by various threads concurrently.

- A counter which increment happens to be already protected by a lock, read
   by various threads without taking the lock. (technically doable, but
   I'm not sure I see a relevant use-case for it)

In user-space:

- The "per-cpu" counter would have to use rseq for increments to prevent
   inopportune migrations, which needs to be implemented in assembler anyway.
   The counter reads would have to use READ_ONCE().

- The per-thread counter (Thread-Local Storage) incremented by a single
   thread, read by various threads concurrently, is a good target
   for WRITE_ONCE()/READ_ONCE() pairing. This is actually what we do in
   various liburcu implementations which track read-side critical sections
   per-thread.

- Same as for the kernel, a counter increment protected by a lock which
   needs to be read from various threads concurrently without taking
   the lock could be a valid use-case, though I fail to see how it is
   useful due to lack of imagination on my part. ;-)

I'm possibly missing other use-cases, but those come to mind as not
involving racy counter increments.

I agree that use-cases are so niche that we probably do _not_ want to
introduce ADD_SHARED() for that purpose in a common header file,
because I suspect that it would then become misused in plenty of
scenarios where the updates are actually racy and would be better
served by atomics or local-atomics.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com