netdev - Re: [PATCH net-next] unix: Guarantee sk_state relevance in case of it was assigned by a task on other cpu

Message-ID: <20230126133322.3bfab5e0@kernel.org>
Date:   Thu, 26 Jan 2023 13:33:22 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Kirill Tkhai <tkhai@...ru>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
        kuniyu@...zon.com, gorcunov@...il.com
Subject: Re: [PATCH net-next] unix: Guarantee sk_state relevance in case of
 it was assigned by a task on other cpu

On Thu, 26 Jan 2023 12:25:11 -0800 Paul E. McKenney wrote:
> > Me trying to prove that memory ordering is transitive would be 100%
> > speculation. Let's ask Paul instead - is the above valid? Or the fact
> > that CPU1 observes state from CPU0 and is strongly ordered with CPU2
> > implies that CPU2 will also observe CPU0's state?  
> 
> Hmmm...  What is listen() doing?  There seem to be a lot of them
> in the kernel.
> 
> But proceeding on first principles...
> 
> Sometimes.  Memory ordering is transitive only when the ordering is
> sufficiently strong.
> 
> In this case, I do not see any ordering between CPU 0 and anything else.
> If the listen() function were to acquire the same mutex as CPU1 and CPU2
> did, and if it acquired it first, then CPU2 would be guaranteed to see
> anything CPU0 did while holding that mutex.

The fuller picture would be:

[CPU0]                     [CPU1]                [CPU2]
WRITE_ONCE(sk->sk_state,
           TCP_LISTEN);
                           val = READ_ONCE(sk->sk_state) 
                           mutex_lock()
                           shared_mem_var = val
                           mutex_unlock()
                                                  mutex_lock()
                                                  if (shared_mem_var == TCP_LISTEN)
                                                     BUG_ON(READ_ONCE(sk->sk_state)
                                                            != TCP_LISTEN)
                                                  mutex_unlock()

> Alternatively, if CPU0 wrote to some memory, and CPU1 read that value
> before releasing the mutex (including possibly before acquiring that
> mutex), then CPU2 would be guaranteed to see that value (or the value
> written by some later write to that same memory) after acquiring that
> mutex.

Which I believe is exactly what happens in the example.
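
For anyone who wants to poke at this outside the kernel, here is a rough
userspace model of the diagram. It is only a sketch under assumptions:
pthreads and relaxed C11 atomics stand in for the kernel mutex and for
READ_ONCE()/WRITE_ONCE(), the C11 memory model is not identical to the
kernel's, and ST_CLOSE, ST_LISTEN and run_scenario() are made-up names
rather than anything in net/unix.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the socket states; the values are arbitrary. */
#define ST_CLOSE  0
#define ST_LISTEN 1

static _Atomic int sk_state = ST_CLOSE;  /* models sk->sk_state */
static int shared_mem_var;               /* protected by 'lock' */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* [CPU0]: publishes the new state without taking the mutex. */
static void *cpu0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&sk_state, ST_LISTEN, memory_order_relaxed);
    return NULL;
}

/* [CPU1]: samples the state, then records it under the mutex. */
static void *cpu1(void *arg)
{
    (void)arg;
    int val = atomic_load_explicit(&sk_state, memory_order_relaxed);

    pthread_mutex_lock(&lock);
    shared_mem_var = val;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* [CPU2]: if CPU1 recorded LISTEN, the state must not read as older. */
static void *cpu2(void *arg)
{
    int *violations = arg;

    pthread_mutex_lock(&lock);
    if (shared_mem_var == ST_LISTEN &&
        atomic_load_explicit(&sk_state, memory_order_relaxed) != ST_LISTEN)
        ++*violations;  /* the BUG_ON() case from the diagram */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Runs the three threads once; returns how often the check failed. */
int run_scenario(void)
{
    void *(*fn[3])(void *) = { cpu0, cpu1, cpu2 };
    pthread_t t[3];
    int violations = 0;
    int i;

    atomic_store(&sk_state, ST_CLOSE);
    shared_mem_var = ST_CLOSE;

    for (i = 0; i < 3; i++) {
        int err = pthread_create(&t[i], NULL, fn[i], &violations);
        assert(err == 0);
        (void)err;
    }
    for (i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return violations;
}
```

Calling run_scenario() in a loop from a small main() should always
return 0 if the reasoning above holds: CPU1's read precedes its unlock,
the unlock orders it before CPU2's lock, so CPU2 cannot read a value
older than the one CPU1 saw. A test run cannot prove that, of course,
it can only fail to find a counterexample.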

> So here are some things you can count on transitively:
> 
> 1.	After acquiring a given lock (or mutex or whatever), you will
> 	see any values written or read prior to any earlier conflicting
> 	release of that same lock.
> 
> 2.	After an access with acquire semantics (for example,
> 	smp_load_acquire()) you will see any values written or read
> 	prior to any earlier access with release semantics (for example,
> 	smp_store_release()).
> 
> Or in all cases, you might see later values, in case someone else also
> did a write to the location in question.
> 
> Does that help, or am I missing a turn in there somewhere?

Very much so, thank you!
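
For the archive, a similar userspace sketch of point 2, with a
release/acquire pair instead of the mutex. Same caveats as any
userspace model: memory_order_release and memory_order_acquire are only
approximate analogues of smp_store_release() and smp_load_acquire(),
and the names data, seen, ready and run_release_acquire() are invented
for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int data;   /* written by the first thread, no ordering */
static int seen;          /* plain variable, published via 'ready' */
static atomic_int ready;  /* release/acquire flag */

/* Writes the value; this thread takes part in no ordering by itself. */
static void *writer(void *arg)
{
    (void)arg;
    atomic_store_explicit(&data, 1, memory_order_relaxed);
    return NULL;
}

/* Reads the value, then does a store with release semantics. */
static void *relay(void *arg)
{
    (void)arg;
    seen = atomic_load_explicit(&data, memory_order_relaxed);
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* After the acquire load, must see whatever relay read, or newer. */
static void *checker(void *arg)
{
    int *violations = arg;

    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;  /* spin until relay's release store is visible */
    if (seen == 1 &&
        atomic_load_explicit(&data, memory_order_relaxed) != 1)
        ++*violations;
    return NULL;
}

/* Runs the three threads once; returns how often the check failed. */
int run_release_acquire(void)
{
    void *(*fn[3])(void *) = { writer, relay, checker };
    pthread_t t[3];
    int violations = 0;
    int i;

    atomic_store(&data, 0);
    atomic_store(&ready, 0);
    seen = 0;

    for (i = 0; i < 3; i++) {
        int err = pthread_create(&t[i], NULL, fn[i], &violations);
        assert(err == 0);
        (void)err;
    }
    for (i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return violations;
}
```

The shape is the same as the mutex case: the value relay read before
its release store is one checker is entitled to see after its acquire
load, so run_release_acquire() is expected to return 0 every time.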