netdev - Re: [PATCH net-next] unix: Guarantee sk_state relevance in case of it was assigned by a task on other cpu

Message-ID: <20230126202511.GL2948950@paulmck-ThinkPad-P17-Gen-1>
Date:   Thu, 26 Jan 2023 12:25:11 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Kirill Tkhai <tkhai@...ru>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
        kuniyu@...zon.com, gorcunov@...il.com
Subject: Re: [PATCH net-next] unix: Guarantee sk_state relevance in case of
 it was assigned by a task on other cpu

On Wed, Jan 25, 2023 at 10:10:53PM -0800, Jakub Kicinski wrote:
> On Thu, 26 Jan 2023 00:09:08 +0300 Kirill Tkhai wrote:
> > 1) There are many combinations with a third task involved:
> > 
> > [CPU0:Task0]  [CPU1:Task1]                           [CPU2:Task2]
> > listen(sk)
> >               kernel:
> >                 sk_diag_fill(sk)
> >                   rep->udiag_state = TCP_LISTEN
> >                 return_from_syscall
> >               userspace:
> >                 mutex_lock()
> >                 shared_mem_var = rep->udiag_state 
> >                 mutex_unlock()
> > 
> >                                                      userspace: 
> >                                                        mutex_lock()
> >                                                        if (shared_mem_var == TCP_LISTEN)
> >                                                          accept(sk); /* -> fail, since sk_state is not visible */
> >                                                        mutex_unlock()
> > 
> > In this situation Task2 definitely knows that Task0's listen() has succeeded, but there is no way
> > to guarantee that its accept() won't fail. Although mutex_lock() and mutex_unlock() contain appropriate barriers,
> > there is no way to add a barrier on CPU1 that makes Task0's store visible on CPU2.
> 
> Me trying to prove that memory ordering is transitive would be 100%
> speculation. Let's ask Paul instead - is the above valid? Or the fact
> that CPU1 observes state from CPU0 and is strongly ordered with CPU2
> implies that CPU2 will also observe CPU0's state?

Hmmm...  What is listen() doing?  There seem to be a lot of them
in the kernel.

But proceeding on first principles...

Sometimes.  Memory ordering is transitive only when the ordering is
sufficiently strong.

In this case, I do not see any ordering between CPU0 and anything else.
If the listen() function were to acquire the same mutex as CPU1 and CPU2
did, and if it acquired it first, then CPU2 would be guaranteed to see
anything CPU0 did while holding that mutex.

Alternatively, if CPU0 wrote to some memory, and CPU1 read that value
before releasing the mutex (including possibly before acquiring that
mutex), then CPU2 would be guaranteed to see that value (or the value
written by some later write to that same memory) after acquiring that
mutex.
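That second scenario can be sketched in userspace.  What follows is only
an analogue, not kernel code: it assumes C11 atomics and pthreads rather
than the Linux-kernel memory model, and sk_state, shared_mem_var, the ST_*
constants, and the task functions are all hypothetical stand-ins.  Task0
stores with no ordering at all, Task1 reads that value before releasing
the mutex, and Task2 reads the same location after acquiring the mutex.

```c
#include <pthread.h>
#include <stdatomic.h>

#define ST_CLOSE  0
#define ST_LISTEN 1

static atomic_int sk_state;     /* hypothetical stand-in for sk->sk_state */
static int shared_mem_var;      /* protected by lock */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Task0: writes the state with no ordering against anything else. */
static void *task0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&sk_state, ST_LISTEN, memory_order_relaxed);
	return NULL;
}

/* Task1: reads Task0's value BEFORE acquiring and releasing the mutex. */
static void *task1(void *arg)
{
	int seen;

	(void)arg;
	do {
		seen = atomic_load_explicit(&sk_state, memory_order_relaxed);
	} while (seen != ST_LISTEN);

	pthread_mutex_lock(&lock);
	shared_mem_var = seen;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Task2: reads the state only after acquiring the same mutex. */
static void *task2(void *arg)
{
	int *observed = arg;
	int done = 0;

	while (!done) {
		pthread_mutex_lock(&lock);
		if (shared_mem_var == ST_LISTEN) {
			*observed = atomic_load_explicit(&sk_state,
							 memory_order_relaxed);
			done = 1;
		}
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

/* Runs the three tasks; returns what Task2 observed, or -1 on error. */
int run_three_task_demo(void)
{
	pthread_t t0, t1, t2;
	int observed = -1;

	atomic_store(&sk_state, ST_CLOSE);
	shared_mem_var = ST_CLOSE;

	if (pthread_create(&t0, NULL, task0, NULL) != 0)
		return -1;
	if (pthread_create(&t1, NULL, task1, NULL) != 0) {
		pthread_join(t0, NULL);
		return -1;
	}
	if (pthread_create(&t2, NULL, task2, &observed) != 0) {
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);
		return -1;
	}
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return observed;
}
```

In this sketch Task1's read is sequenced before its unlock, and that
unlock synchronizes with Task2's later lock, so Task2 cannot read a value
older than the one Task1 saw.  Whether the kernel paths in the diagram
above provide the same chain is a separate question.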

So here are some things you can count on transitively:

1.	After acquiring a given lock (or mutex or whatever), you will
	see any values written or read prior to any earlier conflicting
	release of that same lock.

2.	After an access with acquire semantics (for example,
	smp_load_acquire()) you will see any values written or read
	prior to any earlier access with release semantics (for example,
	smp_store_release()).

In all of these cases you might instead see later values, if someone
else also wrote to the location in question.
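As a rough illustration of item 2, here is a userspace sketch.  It
assumes C11 atomics, with memory_order_release and memory_order_acquire
standing in for smp_store_release() and smp_load_acquire(); payload,
ready, and the function names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;          /* plain data written before the release */
static atomic_int ready;     /* flag accessed with release/acquire */

/* Writer: everything before the release store is published with it. */
static void *producer(void *arg)
{
	(void)arg;
	payload = 42;
	atomic_store_explicit(&ready, 1, memory_order_release);
	return NULL;
}

/* Reader: once the acquire load sees the flag, payload is visible too. */
static void *consumer(void *arg)
{
	int *out = arg;

	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;	/* spin until the release store is observed */
	*out = payload;
	return NULL;
}

/* Runs both threads; returns the payload the reader saw, or -1 on error. */
int run_release_acquire_demo(void)
{
	pthread_t p, c;
	int out = -1;

	payload = 0;
	atomic_store(&ready, 0);

	if (pthread_create(&p, NULL, producer, NULL) != 0)
		return -1;
	if (pthread_create(&c, NULL, consumer, &out) != 0) {
		pthread_join(p, NULL);
		return -1;
	}
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return out;
}
```

The reader can rely on payload only because it first observed the flag
through the acquire load.  With a relaxed load of the flag there would
be no such guarantee in this sketch.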

Does that help, or am I missing a turn in there somewhere?

							Thanx, Paul