Message-ID: <0fae983b-2a7c-d44e-8881-53d5cc053f09@huaweicloud.com>
Date: Thu, 19 Jan 2023 14:39:01 +0100
From: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To: paulmck@...nel.org
Cc: Alan Stern <stern@...land.harvard.edu>,
Andrea Parri <parri.andrea@...il.com>,
Jonas Oberhauser <jonas.oberhauser@...wei.com>,
Peter Zijlstra <peterz@...radead.org>, will <will@...nel.org>,
"boqun.feng" <boqun.feng@...il.com>, npiggin <npiggin@...il.com>,
dhowells <dhowells@...hat.com>,
"j.alglave" <j.alglave@....ac.uk>,
"luc.maranget" <luc.maranget@...ia.fr>, akiyks <akiyks@...il.com>,
dlustig <dlustig@...dia.com>, joel <joel@...lfernandes.org>,
urezki <urezki@...il.com>,
quic_neeraju <quic_neeraju@...cinc.com>,
frederic <frederic@...nel.org>,
Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Internal vs. external barriers (was: Re: Interesting LKMM litmus
test)
On 1/19/2023 1:11 AM, Paul E. McKenney wrote:
> On Wed, Jan 18, 2023 at 10:24:50PM +0100, Jonas Oberhauser wrote:
>> What I was thinking of is more something like this:
>>
>> P0{
>> idx1 = srcu_down(&ss);
>> srcu_up(&ss,idx1);
>> }
>>
>> P1{
>> idx2 = srcu_down(&ss);
srcu_up(&ss,idx2);
>> }
> And srcu_read_lock() and srcu_read_unlock() already do this.
I think I left out too much from my example.
And filling in the details led me down a bit of a rabbit hole of
confusion for a while.
But here's what I ended up with:
P0{
idx1 = srcu_down(&ss);
store_rel(p1, true);
shared cs
R x == ?
while (! load_acq(p2));
R idx2 == idx1 // for some reason, we got lucky!
srcu_up(&ss,idx1);
}
P1{
idx2 = srcu_down(&ss);
store_rel(p2, true);
shared cs
R y == 0
while (! load_acq(p1));
srcu_up(&ss,idx2);
}
P2 {
W y = 1
srcu_sync(&ss);
W x = 1
}
Assuming that like indicated above both threads happen to read the same
index, are you guaranteed that the shared cs lasts until both P0 and P1
have performed their final up?
Is it allowed for P0 to read x==1?
If you define matching up&down as you do through the data link, then we
get something like
P1's down ->po;prop;po grace period
thus
P1's up ->rcu-order grace period
P0's down ->po;hb;po P1's up ->rcu-order grace period
P0's up ->srcu-rscsi;rcu-link;rcu-order grace period
Unfortunately, this is not enough to rcu-order P0's up with the grace
period -- you'd need a second rcu-gp for that!
Looking at it from the other side, because P0 reads x==1, we have
grace period ->po;rfe;po P0's up
and thus
grace period ->rcu-order P0's down ->po;hb;po P1's up
but again this would not order the grace period with P1's up, because
you'd need a second grace period for that.
When sketching it out on paper, I couldn't find any forbidden cycles,
and so x==1 would be allowed. (But as I said, I managed to confuse
myself a few times with this, so if you find a forbidden cycle let me know.)
But note that the synchronization in there and the fact that both have
the same index ensures that the two grace periods overlap, in a
hypothetical order it would be
down() -> down() -> up() -> up()
(with any permutation of P0 and P1 over these events so that they each
get 1 up() and 1 down()) and thus the grace period must actually end
after both, or start before both.
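The overlap argument can be checked mechanically. Treating the shared
index as one semaphore count (a small sketch, not the kernel's actual
bookkeeping): the store_rel/load_acq handshake forces both down()s
before either up(), and in every such interleaving the count stays
nonzero until the final up(), so a sum-based grace period cannot
complete inside that window:

```python
from itertools import permutations

def trace(events):
    """Running reader count on the shared index after each event."""
    c, out = 0, []
    for e in events:
        c += 1 if e.startswith("down") else -1
        out.append(c)
    return out

# The p1/p2 handshake forces both down()s before either up():
orders = [p for p in permutations(["down0", "down1", "up0", "up1"])
          if max(p.index("down0"), p.index("down1"))
             < min(p.index("up0"), p.index("up1"))]

# In every admitted interleaving the count never hits 0 before the
# last up(): the two critical sections form one contiguous window.
assert orders and all(0 not in trace(p)[:-1] for p in orders)
```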
With the definition I offered, you would get
P0's up() ->srcu-rscsi P1's down()
and
P1's up() ->srcu-rscsi P0's down()
and in particular
Rx1 ->po P0's up() ->srcu-rscsi P1's down() ->po Ry0 ->prop Wy1 ->po
srcu-gp on the same loc ->po Wx1 ->rfe Rx1
which can be collapsed to
Rx1 ->po;rcu-order;po;hb Rx1 which isn't irreflexive
Thus x==1 would be forbidden.
This is more semaphore-like, where the same cookie shared between
threads implies that it's the same semaphore, and the overlapping
guarantee (through synchronizing on p1,p2 in the beginning) means that
the critical sections overlap.
In contrast, I wouldn't suggest the same for srcu_lock and srcu_unlock,
where even though you may get the same cookie by accident, those might
still be two completely independent critical sections.
For example, you could imagine a single percpu counter _cnt (per index
of course) that is incremented and decremented for lock() and unlock(),
and the condition to pass an srcu_sync() of a given index is that the
cpu[...]._cnt[idx] are all individually 0 and the sum of all ups[idx] is
equal to the sum of all downs[idx].
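As a sketch of that pass condition (all names invented for illustration,
not the kernel's data structures): srcu_sync() for a given index may
complete only once every per-cpu lock/unlock count is individually zero
and the global up/down totals balance:

```python
# Hypothetical per-index state: per-cpu lock()/unlock() counts plus
# global srcu_up()/srcu_down() totals (names are made up).
state = {
    "cnt":   {"cpu0": 0, "cpu1": 0},   # lock() +1 / unlock() -1, per cpu
    "ups":   0,                        # total srcu_up() calls
    "downs": 0,                        # total srcu_down() calls
}

def can_pass_sync(st):
    """srcu_sync() may complete for this index iff every per-cpu count
    is individually 0 and the ups balance the downs."""
    return all(c == 0 for c in st["cnt"].values()) and st["ups"] == st["downs"]

assert can_pass_sync(state)            # idle: sync may pass

state["downs"] += 1                    # a reader does srcu_down()
assert not can_pass_sync(state)        # reader in flight: sync blocked

state["cnt"]["cpu0"] += 1              # an unrelated lock() on cpu0
state["ups"] += 1                      # the reader does srcu_up()
assert not can_pass_sync(state)        # cpu0's count is still nonzero

state["cnt"]["cpu0"] -= 1              # unlock() on cpu0
assert can_pass_sync(state)            # all conditions met again
```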
If you create an operational model of up() and down() in terms of such a
per-index semaphore, I think the x==1 case would similarly need to be
forbidden. Since the grace period must end after P1's critical section,
and P0's and P1's critical sections overlap and use the same semaphore,
the count is never 0 at any point in time while either P0 or P1 is in
its critical section, and so the grace period must also end after P0's
critical section. But then x=1 can't yet have propagated to P0 when it
reads x inside its critical section.
In contrast, if the operational model of lock() and unlock() is a
per-index and per-cpu count, then the x==1 case would be allowed, e.g.,
as follows (time from left to right, all processes happen in parallel):
P0: < Rx1 >
P1: < Ry0 >
P2: y=1 < P0! P1! > x=1
here < and > mark the start and end of cs and gp, and Pn! is the time
the gp realizes that Pn was not in a cs.
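That timeline can be replayed against a toy version of the per-cpu
counting (a hand-rolled sketch; the real grace-period scan is more
involved): the gp samples cpu0 before P0 enters its section and cpu1
after P1 leaves, so it completes even though P0's section overlaps its
end:

```python
# Per-cpu lock counts for one index (hypothetical sketch of the
# lock()/unlock() model; the gp polls each cpu at a single instant).
cnt = {"cpu0": 0, "cpu1": 0}
log = []

def lock(cpu):   cnt[cpu] += 1
def unlock(cpu): cnt[cpu] -= 1

def gp_sees_idle(cpu):
    """One poll of a single cpu's count by the grace period."""
    ok = cnt[cpu] == 0
    log.append((cpu, ok))
    return ok

# Replay of the timeline: P1 is in its section, the gp checks cpu0
# first ("P0!"), P0 enters only afterwards, P1 exits, the gp checks
# cpu1 ("P1!") and completes while P0 is still inside.
lock("cpu1")                    # P1: < ... Ry0 ...
assert gp_sees_idle("cpu0")     # gp: P0!  (cpu0 not yet in a cs)
lock("cpu0")                    # P0: < ... about to read x ...
unlock("cpu1")                  # P1: ... >
assert gp_sees_idle("cpu1")     # gp: P1!  -> gp ends, x=1 propagates
assert cnt["cpu0"] == 1         # P0 still in its cs: Rx1 is allowed
unlock("cpu0")                  # P0: ... >
```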
Best wishes,
jonas