[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bb794e83-992e-8181-d9b9-acc68536ce5a@efficios.com>
Date: Wed, 21 Dec 2022 12:11:42 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: Joel Fernandes <joel@...lfernandes.org>,
linux-kernel@...r.kernel.org,
Josh Triplett <josh@...htriplett.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>, rcu@...r.kernel.org,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [RFC 0/2] srcu: Remove pre-flip memory barrier
On 2022-12-21 06:59, Frederic Weisbecker wrote:
> On Tue, Dec 20, 2022 at 10:34:19PM -0500, Mathieu Desnoyers wrote:
[...]
>>
>> The memory ordering constraint I am concerned about here is:
>>
>> * [...] In addition,
>> * each CPU having an SRCU read-side critical section that extends beyond
>> * the return from synchronize_srcu() is guaranteed to have executed a
>> * full memory barrier after the beginning of synchronize_srcu() and before
>> * the beginning of that SRCU read-side critical section. [...]
>>
>> So if we have a SRCU read-side critical section that begins after the beginning
>> of synchronize_srcu, but before its first memory barrier, it would miss the
>> guarantee that the full memory barrier is issued before the beginning of that
>> SRCU read-side critical section. IOW, that memory barrier needs to be at the
>> very beginning of the grace period.
>
> I'm confused, what's wrong with this ?
>
> UPDATER READER
> ------- ------
> STORE X = 1 STORE srcu_read_lock++
> // rcu_seq_snap() smp_mb()
> smp_mb() READ X
> // scans
> READ srcu_read_lock
What you refer to here is only memory ordering of the store to X and
load from X wrt loading/increment of srcu_read_lock, which is internal
to the srcu implementation. If we really want to model the provided
high-level memory ordering guarantees, we should consider a scenario
where SRCU is used for its memory ordering properties to synchronize
other variables.
I'm concerned about the following Dekker scenario, where
synchronize_srcu() and srcu_read_lock/unlock would be used instead of
memory barriers:
Initial state: X = 0, Y = 0
Thread A Thread B
---------------------------------------------
STORE X = 1 STORE Y = 1
synchronize_srcu()
srcu_read_lock()
r1 = LOAD X
srcu_read_unlock()
r0 = LOAD Y
BUG_ON(!r0 && !r1)
So in the synchronize_srcu implementation, there appears to be two
major scenarios: either srcu_gp_start_if_needed starts a gp or expedited
gp, or it uses an already started gp/expedited gp. When snapshotting
with rcu_seq_snap, the fact that the memory barrier is after the
ssp->srcu_gp_seq load means that it does not order prior memory accesses
before that load. This sequence value is then used to identify which
gp_seq to wait for when piggy-backing on another already-started gp. I
worry about reordering between STORE X = 1 and load of ssp->srcu_gp_seq,
which is then used to piggy-back on an already-started gp.
I suspect that the implicit barrier in srcu_read_lock() invoked at the
beginning of srcu_gp_start_if_needed() is really the barrier that makes
all this behave as expected. But without documentation it's rather hard
to follow.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists