lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2721d70f-67a8-428f-903b-1c7c6c6da7f9@efficios.com>
Date: Thu, 7 Mar 2024 14:53:43 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: paulmck@...nel.org
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
 Steven Rostedt <rostedt@...dmis.org>, linke li <lilinke99@...com>,
 joel@...lfernandes.org, boqun.feng@...il.com, dave@...olabs.net,
 frederic@...nel.org, jiangshanlai@...il.com, josh@...htriplett.org,
 linux-kernel@...r.kernel.org, qiang.zhang1211@...il.com,
 quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix
 rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency
 bug

On 2024-03-07 14:47, Paul E. McKenney wrote:
> On Thu, Mar 07, 2024 at 08:53:05AM -0500, Mathieu Desnoyers wrote:
>> On 2024-03-06 22:37, Paul E. McKenney wrote:
>>> On Wed, Mar 06, 2024 at 10:06:21PM -0500, Mathieu Desnoyers wrote:
>> [...]
>>>
>>>> As far as the WRITE_ONCE(x, READ_ONCE(x) + 1) pattern
>>>> is concerned, the only valid use-case I can think of is
>>>> split counters or RCU implementations where there is a
>>>> single updater doing the increment, and one or more
>>>> concurrent reader threads that need to snapshot a
>>>> consistent value with READ_ONCE().
>>>
>> [...]
>>>
>>> So what would you use that pattern for?
>>>
>>> One possibility is a per-CPU statistical counter in userspace on a
>>> fastpath, in cases where losing the occasional count is OK.  Then learning
>>> your CPU (and possibly being immediately migrated to some other CPU),
>>> READ_ONCE() of the count, increment, and WRITE_ONCE() might (or might not)
>>> make sense.
>>>
>>> I suppose the same in the kernel if there was a fastpath so extreme you
>>> could not afford to disable preemption.
>>>
>>> At best, very niche.
>>>
>>> Or am I suffering a failure of imagination yet again?  ;-)
>>
>> The (niche) use-cases I have in mind are split-counters and RCU
>> grace period tracking, where precise counters totals are needed
>> (no lost count).
>>
>> In the kernel, this could be:
> 
> Thank you for looking into this!
> 
>> - A per-cpu counter, each counter incremented from thread context with
>>    preemption disabled (single updater per counter), read concurrently by
>>    other threads. WRITE_ONCE/READ_ONCE is useful to make sure there
>>    is no store/load tearing there. Atomics on the update would be stronger
>>    than necessary on the increment fast-path.
> 
> But if preemption is disabled, the updater can read the value without
> READ_ONCE() without risk of concurrent update.  Or are you concerned about
> interrupt handlers?  This would have to be a read from the interrupt
> handler, given that an updated from the interrupt handler could result
> in a lost count.

You are correct that the updater don't need READ_ONCE there. It would
however require a WRITE_ONCE() to match READ_ONCE() from concurrent
reader threads.

> 
>> - A per-thread counter (e.g. within task_struct), only incremented by the
>>    single thread, read by various threads concurrently.
> 
> Ditto.

Right, only WRITE_ONCE() on the single updater, READ_ONCE() on readers.

> 
>> - A counter which increment happens to be already protected by a lock, read
>>    by various threads without taking the lock. (technically doable, but
>>    I'm not sure I see a relevant use-case for it)
> 
> In that case, the lock would exclude concurrent updates, so the lock
> holder would not need READ_ONCE(), correct?

Correct.

> 
>> In user-space:
>>
>> - The "per-cpu" counter would have to use rseq for increments to prevent
>>    inopportune migrations, which needs to be implemented in assembler anyway.
>>    The counter reads would have to use READ_ONCE().
> 
> Fair enough!
> 
>> - The per-thread counter (Thread-Local Storage) incremented by a single
>>    thread, read by various threads concurrently, is a good target
>>    for WRITE_ONCE()/READ_ONCE() pairing. This is actually what we do in
>>    various liburcu implementations which track read-side critical sections
>>    per-thread.
> 
> Agreed, but do any of these use WRITE_ONCE(x, READ_ONCE(x) + 1) or
> similar?

Not quite, I recall it's more like WRITE_ONCE(x, READ_ONCE(y)) or such,
so we can grab the value of the current gp counter and store it into a
TLS variable.


> 
>> - Same as for the kernel, a counter increment protected by a lock which
>>    needs to be read from various threads concurrently without taking
>>    the lock could be a valid use-case, though I fail to see how it is
>>    useful due to lack of imagination on my part. ;-)
> 
> In RCU, we have "WRITE_ONCE(*sp, *sp + 1)" for this use case, though
> here we have the WRITE_ONCE() but not the READ_ONCE() because we hold
> the lock excluding any other updates.

You are right, the READ_ONCE() is not needed in this case for the
updater, only for the concurrent readers.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ