lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <daa60273-d01a-8fc5-5e26-e8fc9364c1d8@huaweicloud.com>
Date: Fri, 25 Oct 2024 11:32:20 +0200
From: Hernan Ponce de Leon <hernan.poncedeleon@...weicloud.com>
To: Andrea Parri <parri.andrea@...il.com>, puranjay@...nel.org,
 paulmck@...nel.org
Cc: bpf@...r.kernel.org, lkmm@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: Some observations (results) on BPF acquire and release

On 10/23/2024 7:47 PM, Andrea Parri wrote:
> Hi Puranjay and Paul,
> 
> I'm running some experiment on the (experimental) formalization of BPF
> acquire and release available from [1] and wanted to report about some
> (initial) observations for discussion and possibly future developments;
> apologies in advance for the relatively long email and any repetition.
> 
> 
> A first and probably most important observation is that the (current)
> formalization of acquire and release appears to be "too strong": IIUC,
> the simplest example/illustration for this is given by the following
> 
> BPF R+release+fence
> {
>   0:r2=x; 0:r4=y;
>   1:r2=y; 1:r4=x; 1:r6=l;
> }
>   P0                                 | P1                                         ;
>   r1 = 1                             | r1 = 2                                     ;
>   *(u32 *)(r2 + 0) = r1              | *(u32 *)(r2 + 0) = r1                      ;
>   r3 = 1                             | r5 = atomic_fetch_add((u32 *)(r6 + 0), r5) ;
>   store_release((u32 *)(r4 + 0), r3) | r3 = *(u32 *)(r4 + 0)                      ;
> exists ([y]=2 /\ 1:r3=0)
> 
> This "exists" condition is not satisfiable according to the BPF model;
> however, if we adopt the "natural"/intended(?) PowerPC implementations
> of the synchronization primitives above (aka, with store_release() -->
> LWSYNC and atomic_fetch_add() --> SYNC ; [...] ), then we see that the
> condition in question becomes (architecturally) satisfiable on PowerPC
> (although I'm not aware of actual observations on PowerPC hardware).

Are the resulting PPC tests available somewhere?

> 
> 
> At first, the previous observation (validated via simulations and later
> extended to similar but more complex scenarios ) made me believe that
> the BPF formalization of acquire and release could be strictly stronger
> than the corresponding LKMM formalization; but that is _not_ the case:
> 
> The following "exists" condition is satisfiable according to the BPF
> model (and it remains satisfiable even if the load_acquire() in P2 is
> paired with an additional store_release() in P1).  In contrast, the
> corresponding LKMM condition (e.g load_acquire() --> smp_load_acquire()
> and atomic_fetch_add() --> smp_mb()) is not satisfiable (in fact, the
> same conclusion holds even if the putative smp_load_acquire() in P2 is
> "replaced" with an smp_rmb() or with an address dependency).
> 
> BPF Z6.3+fence+fence+acquire
> {
>   0:r2=x; 0:r4=y; 0:r6=l;
>   1:r2=y; 1:r4=z; 1:r6=m;
>   2:r2=z; 2:r4=x;
> }
>   P0                                         | P1                                         | P2                                 ;
>   r1 = 1                                     | r1 = 2                                     | r1 = load_acquire((u32 *)(r2 + 0)) ;
>   *(u32 *)(r2 + 0) = r1                      | *(u32 *)(r2 + 0) = r1                      | r3 = *(u32 *)(r4 + 0)              ;
>   r5 = atomic_fetch_add((u32 *)(r6 + 0), r5) | r5 = atomic_fetch_add((u32 *)(r6 + 0), r5) |                                    ;
>   r3 = 1                                     | r3 = 1                                     |                                    ;
>   *(u32 *)(r4 + 0) = r3                      | *(u32 *)(r4 + 0) = r3                      |                                    ;
> exists ([y]=2 /\ 2:r1=1 /\ 2:r3=0)
> 
> 
> These remarks show that the proposed BPF formalization of acquire and
> release somehow, but substantially, diverged from the corresponding
> LKMM formalization.  My guess is that the divergences mentioned above
> were not (fully) intentional, or I'm wondering -- why not follow the
> latter (the LKMM's) more closely? -  This is probably the first question
> I would need to clarify before trying/suggesting modifications to the
> present formalizations.  ;-)  Thoughts?
> 
>    Andrea
> 
> 
> [1] https://github.com/puranjaymohan/herdtools7/commits/bpf_acquire_release/


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ