Date:   Fri, 27 Jan 2023 15:31:25 +0100
From:   Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     paulmck@...nel.org, parri.andrea@...il.com, will@...nel.org,
        peterz@...radead.org, boqun.feng@...il.com, npiggin@...il.com,
        dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
        akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
        urezki@...il.com, quic_neeraju@...cinc.com, frederic@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] tools/memory-model: Make ppo a subrelation of po



On 1/26/2023 5:36 PM, Alan Stern wrote:
> On Thu, Jan 26, 2023 at 02:46:04PM +0100, Jonas Oberhauser wrote:
>> As stated in the documentation and implied by its name, the ppo
>> (preserved program order) relation is intended to link po-earlier
>> to po-later instructions under certain conditions.  However, a
>> corner case currently allows instructions to be linked by ppo that
>> are not executed by the same thread, i.e., instructions are being
>> linked that have no po relation.
>>
>> This happens due to the mb/strong-fence relations, which (in one
>> case) provide order when a lock is passed between threads and then
>> followed by an smp_mb__after_unlock_lock() fence. This is illustrated in
>> the following litmus test (as can be seen when using herd7 with
>> `doshow ppo`):
>>
>> P0(spinlock_t *x, int *y)
>> {
>>      spin_lock(x);
>>      spin_unlock(x);
>> }
>>
>> P1(spinlock_t *x, int *y)
>> {
>>      spin_lock(x);
>>      smp_mb__after_unlock_lock();
>>      *y = 1;
>> }
>>
>> The ppo relation will link P0's spin_lock(x) and P1's *y=1, because
>> P0 passes the lock to P1, which then uses this fence.
>>
>> The patch makes ppo a subrelation of po by eliminating this possibility
>> from mb (but not strong-fence) and relying explicitly on mb|gp instead
>> of strong-fence when defining ppo.
>>
>> Signed-off-by: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
>> ---
> This changes the meaning of the fence relation, which is used in
> w-pre-bounded, w-post-bounded, ww-vis, wr-vis, and rw-xbstar.  Have you
> checked that they won't be affected by the change?

Good point; in fact, it changes the nonrw-fence as well, which is used
in the r-* relations.
I had missed this completely. That's what I get for not looking at the
data race relations!
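
For reference (quoting the relevant linux-kernel.cat definitions from
memory, so please double-check them against the tree):

    let strong-fence = mb | gp

    let nonrw-fence = strong-fence | po-rel | acq-po
    let fence = nonrw-fence | wmb | rmb

    let vis = cumul-fence* ; rfe? ; [Marked] ;
        ((strong-fence ; [Marked] ; xbstar) | (xbstar & int))

    let w-pre-bounded = [Marked] ; (addr | fence)?
    let w-post-bounded = fence? ; [Marked]

    let ww-vis = fence | (strong-fence ; xbstar ; w-pre-bounded) |
        (w-post-bounded ; vis ; w-pre-bounded)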


Let's go through the individual cases.
Suppose we have, e.g., *-post-bounded because of the
po-unlock-lock-po;[After-unlock-lock];po edge. This may be due either
to the fence rule or to (...fence ; vis ; *-pre-bounded).
In the first case the edge decomposes as po-rel ; rfe ; acq-po, and we
get fence;rfe;(xbstar&int);fence, which gives us *-post-bounded.
In the second case we now have a strong-fence; since vis <= xbstar,
**-vis is preserved by switching from the fence rule to the
strong-fence;xbstar;... case.
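
Spelled out against the definitions above, with ww-vis as the
representative (the decomposition is the one just described; the event
names are mine):

    (* Old lock-handover mb edge:  M ->po-rel U ->rfe L ->acq-po M'
     *
     * Case 1: ww-vis held through the bare `fence` disjunct.
     * Rebuild it as  w-post-bounded ; vis ; w-pre-bounded  with
     *   M ->po-rel U   as the fence inside w-post-bounded,
     *   U ->rfe L      as the rfe? inside vis,
     *   L ->acq-po M'  as (xbstar & int) inside vis (possibly the
     *                  identity) plus the fence inside w-pre-bounded.
     *
     * Case 2: ww-vis held through (w-post-bounded ; vis ;
     * w-pre-bounded) with the old mb inside w-post-bounded.  That
     * edge is still a strong-fence, and vis <= xbstar, so the
     * (strong-fence ; xbstar ; w-pre-bounded) disjunct applies
     * instead. *)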

For *-pre-bounded, the situation is tricky because of the xbstar&int
that can appear at the end of vis when *-pre-bounded is used to define
w*-vis through (w-post-bounded;vis;*-pre-bounded). In this case, the
xbstar&int can point in the opposite direction of po, which means that
the unlock that creates the strong fence might not be po-after the
instruction that starts the link.

Here's a litmus test illustrating the difference, where P1 has a
backwards-pointing xbstar&int edge. Currently there's no data race,
but with the proposed patch there is one.

P0(int *x, int *y)
{
     *x = 1;
     smp_store_release(y, 1);
}

P1(int *x, int *y, int *dx, int *dy, spinlock_t *l)
{
     spin_lock(l);
     int r1 = READ_ONCE(*dy);
     if (r1 == 1)
         spin_unlock(l);

     int r0 = smp_load_acquire(y);
     if (r0 == 1) {
         WRITE_ONCE(*dx, 1);
     }
}

P2(int *dx, int *dy)
{
     WRITE_ONCE(*dy, READ_ONCE(*dx));
}


P3(int *x, spinlock_t *l)
{
     spin_lock(l);
     smp_mb__after_unlock_lock();
     *x = 2;
}
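
(To actually feed this to herd7 it needs the usual litmus wrapper; the
test name and the exists clause below are placeholders I made up, the
tool just wants some final condition:)

    C ppo-po-race

    {}

    (* P0 .. P3 exactly as above *)

    exists (1:r0=1 /\ 1:r1=1)

Then something like `herd7 -conf linux-kernel.cfg ppo-po-race.litmus`
run from tools/memory-model should show whether the data race is
flagged.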


This actually makes me wonder. I thought the reason for the xbstar&int
is that it ensures that the overall relation, after shuffling things
around a little, becomes prop&int ; hb*.

For example, if the *x=2 is co-before the *x=1, we get

   Wx2 ->overwrite Wx1
       ->cumul-fence*;rfe (some event on the same CPU as Wx2)
       ->fence Wx2

which is

   Wx2 ->prop&int (some other event) ->hb Wx2

which must be irreflexive.

However, that's not the case at all, because the fence relation
currently doesn't actually have to relate events of the same CPU.
So we get

   Wx2 ->overwrite Wx1
       ->cumul-fence*;rfe (some event on some other CPU than Wx2's)
       ->(hb*&int);fence Wx2

i.e.,

   Wx2 ->prop&ext;hb*;strong-fence Wx2

which shouldn't provide any ordering in general.
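
(For contrast, the shape the model does forbid is the propagation
axiom; again quoting linux-kernel.cat from memory:

    let pb = prop ; strong-fence ; hb* ; [Marked]
    acyclic pb as propagation

and with a nonempty hb* in the middle, no rotation of the
Wx2 ->prop&ext;hb*;strong-fence Wx2 cycle matches
prop ; strong-fence ; hb*, since here the strong-fence comes after
the hb*.)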

In fact, if the *x=1 and *x=2 are replaced with WRITE_ONCEs, then
(pilot errors notwithstanding) both Wx1 ->co Wx2 and Wx2 ->co Wx1
become allowed in the current LKMM (in graphs where all other edges
are equal).

Shouldn't this actually *be* a data race? And potentially the same with 
rcu-fence?

The other cases of *-pre-bounded seem to work out: they all link
stuff via xbstar to the instruction that is linked via
po-unlock-lock-po ; [After-unlock-lock] ; po to the potentially racy
access, and po-unlock-lock-po;po is contained in xbstar ; acq-po,
which allows closing the gap.
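
(Sketch of that last inclusion, expanding the po-unlock-lock-po
definition from linux-kernel.cat; the comments are my paraphrase of
the argument:

    po-unlock-lock-po ; po
      = po ; [UL] ; (po | rf) ; [LKW] ; po ; po
     <= xbstar ; [LKW] ; po  (* the prefix up to the lock is in xbstar *)
     <= xbstar ; acq-po      (* a successful spin_lock acts as an acquire *)

so the xbstar link can be extended all the way to the racy access.)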

Best wishes, jonas
