Message-ID: <2721d70f-67a8-428f-903b-1c7c6c6da7f9@efficios.com>
Date: Thu, 7 Mar 2024 14:53:43 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: paulmck@...nel.org
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>, linke li <lilinke99@...com>,
joel@...lfernandes.org, boqun.feng@...il.com, dave@...olabs.net,
frederic@...nel.org, jiangshanlai@...il.com, josh@...htriplett.org,
linux-kernel@...r.kernel.org, qiang.zhang1211@...il.com,
quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix
rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency
bug
On 2024-03-07 14:47, Paul E. McKenney wrote:
> On Thu, Mar 07, 2024 at 08:53:05AM -0500, Mathieu Desnoyers wrote:
>> On 2024-03-06 22:37, Paul E. McKenney wrote:
>>> On Wed, Mar 06, 2024 at 10:06:21PM -0500, Mathieu Desnoyers wrote:
>> [...]
>>>
>>>> As far as the WRITE_ONCE(x, READ_ONCE(x) + 1) pattern
>>>> is concerned, the only valid use-case I can think of is
>>>> split counters or RCU implementations where there is a
>>>> single updater doing the increment, and one or more
>>>> concurrent reader threads that need to snapshot a
>>>> consistent value with READ_ONCE().
>>>
>> [...]
>>>
>>> So what would you use that pattern for?
>>>
>>> One possibility is a per-CPU statistical counter in userspace on a
>>> fastpath, in cases where losing the occasional count is OK. Then learning
>>> your CPU (and possibly being immediately migrated to some other CPU),
>>> READ_ONCE() of the count, increment, and WRITE_ONCE() might (or might not)
>>> make sense.
>>>
>>> I suppose the same in the kernel if there was a fastpath so extreme you
>>> could not afford to disable preemption.
>>>
>>> At best, very niche.
>>>
>>> Or am I suffering a failure of imagination yet again? ;-)
>>
>> The (niche) use-cases I have in mind are split-counters and RCU
>> grace period tracking, where precise counter totals are needed
>> (no lost count).
>>
>> In the kernel, this could be:
>
> Thank you for looking into this!
>
>> - A per-cpu counter, each counter incremented from thread context with
>> preemption disabled (single updater per counter), read concurrently by
>> other threads. WRITE_ONCE/READ_ONCE is useful to make sure there
>> is no store/load tearing there. Atomics on the update would be stronger
>> than necessary on the increment fast-path.
>
> But if preemption is disabled, the updater can read the value without
> READ_ONCE() without risk of concurrent update. Or are you concerned about
> interrupt handlers? This would have to be a read from the interrupt
> handler, given that an update from the interrupt handler could result
> in a lost count.
You are correct that the updater doesn't need READ_ONCE() there. It
does, however, require a WRITE_ONCE() to match the READ_ONCE() from
concurrent reader threads.
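
Something like the following sketch, just to illustrate what I mean
(the counter and helper names are made up, not an existing kernel API):

/* Illustrative sketch only: a made-up per-CPU split counter. */
/* (kernel code; needs <linux/percpu.h> and <linux/preempt.h>) */
DEFINE_PER_CPU(unsigned long, my_split_count);

static inline void my_split_count_inc(void)
{
	unsigned long *cnt;

	preempt_disable();		/* single updater per counter */
	cnt = this_cpu_ptr(&my_split_count);
	/* Plain load is fine here; only the store must not tear. */
	WRITE_ONCE(*cnt, *cnt + 1);
	preempt_enable();
}

static unsigned long my_split_count_read(void)
{
	unsigned long sum = 0;
	int cpu;

	for_each_possible_cpu(cpu)	/* concurrent, lockless readers */
		sum += READ_ONCE(per_cpu(my_split_count, cpu));
	return sum;
}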
>
>> - A per-thread counter (e.g. within task_struct), only incremented by the
>> single thread, read by various threads concurrently.
>
> Ditto.
Right, only WRITE_ONCE() on the single updater, READ_ONCE() on readers.
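
For instance (sketch only, the task_struct field is invented for the
sake of the example):

/* Sketch: hypothetical per-task counter, e.g. a field in task_struct. */
/* struct task_struct { ...; unsigned long my_evt_count; ... }; */

static inline void my_evt_count_inc(void)
{
	/* Only ever called by the owning thread itself. */
	WRITE_ONCE(current->my_evt_count, current->my_evt_count + 1);
}

static inline unsigned long my_evt_count_read(struct task_struct *t)
{
	return READ_ONCE(t->my_evt_count);	/* any thread, lockless */
}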
>
>> - A counter whose increment happens to be already protected by a lock, read
>> by various threads without taking the lock. (technically doable, but
>> I'm not sure I see a relevant use-case for it)
>
> In that case, the lock would exclude concurrent updates, so the lock
> holder would not need READ_ONCE(), correct?
Correct.
>
>> In user-space:
>>
>> - The "per-cpu" counter would have to use rseq for increments to prevent
>> inopportune migrations, which needs to be implemented in assembler anyway.
>> The counter reads would have to use READ_ONCE().
>
> Fair enough!
>
>> - The per-thread counter (Thread-Local Storage) incremented by a single
>> thread, read by various threads concurrently, is a good target
>> for WRITE_ONCE()/READ_ONCE() pairing. This is actually what we do in
>> various liburcu implementations which track read-side critical sections
>> per-thread.
>
> Agreed, but do any of these use WRITE_ONCE(x, READ_ONCE(x) + 1) or
> similar?
Not quite. I recall it's more like WRITE_ONCE(x, READ_ONCE(y)) or
similar, so we can grab the value of the current gp counter and store
it into a TLS variable.
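
Conceptually something like the following (a simplified sketch, not the
actual liburcu code; names are illustrative, and the real implementation
also handles nesting and memory ordering):

/* Minimal userspace equivalents of the kernel's accessors: */
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

static unsigned long gp_ctr;			/* bumped by the single updater */
static __thread unsigned long reader_gp;	/* per-thread (TLS) snapshot */

static inline void reader_begin(void)
{
	/* Grab the current gp counter into the TLS variable: the
	 * WRITE_ONCE(x, READ_ONCE(y)) shape mentioned above. */
	WRITE_ONCE(reader_gp, READ_ONCE(gp_ctr));
}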
>
>> - Same as for the kernel, a counter increment protected by a lock which
>> needs to be read from various threads concurrently without taking
>> the lock could be a valid use-case, though I fail to see how it is
>> useful due to lack of imagination on my part. ;-)
>
> In RCU, we have "WRITE_ONCE(*sp, *sp + 1)" for this use case, though
> here we have the WRITE_ONCE() but not the READ_ONCE() because we hold
> the lock excluding any other updates.
You are right, the READ_ONCE() is not needed in this case for the
updater, only for the concurrent readers.
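
In other words, something along these lines (sketch only, names made
up):

/* Sketch: increment serialized by a lock, readers lockless. */
static DEFINE_SPINLOCK(my_lock);
static unsigned long my_counter;

static void my_counter_inc(void)
{
	spin_lock(&my_lock);
	/* The lock excludes other updaters, so a plain load is fine;
	 * WRITE_ONCE() keeps the store from tearing for lockless readers. */
	WRITE_ONCE(my_counter, my_counter + 1);
	spin_unlock(&my_lock);
}

static unsigned long my_counter_read(void)
{
	return READ_ONCE(my_counter);	/* no lock taken */
}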
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com