[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221121171602.31927-1-kuniyu@amazon.com>
Date: Mon, 21 Nov 2022 09:16:02 -0800
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <tkhai@...ru>
CC: <davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<kuniyu@...zon.com>, <netdev@...r.kernel.org>, <pabeni@...hat.com>
Subject: Re: [PATCH net] unix: Fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
From: Kirill Tkhai <tkhai@...ru>
Date: Sun, 20 Nov 2022 14:43:06 +0300
> On 20.11.2022 12:46, Kirill Tkhai wrote:
> > On 20.11.2022 12:09, Kuniyuki Iwashima wrote:
> >> From: Kirill Tkhai <tkhai@...ru>
> >> Date: Sun, 20 Nov 2022 02:16:47 +0300
> >>> There is a race resulting in alive SOCK_SEQPACKET socket
> >>> may change its state from TCP_ESTABLISHED to TCP_CLOSE:
> >>>
> >>> unix_release_sock(peer) unix_dgram_sendmsg(sk)
> >>> sock_orphan(peer)
> >>> sock_set_flag(peer, SOCK_DEAD)
> >>> sock_alloc_send_pskb()
> >>> if !(sk->sk_shutdown & SEND_SHUTDOWN)
> >>> OK
> >>> if sock_flag(peer, SOCK_DEAD)
> >>> sk->sk_state = TCP_CLOSE
> >>> sk->sk_shutdown = SHUTDOWN_MASK
> >>>
> >>>
> >>> After that socket sk remains almost normal: it is able to connect, listen, accept
> >>> and recvmsg, while it can't sendmsg.
> >>
> >> nit: Then, also recvmsg() fails with -ENOTCONN. And after connect(), even
> >> both of recvmsg() and sendmsg() does not fail.
> >
> > Possible, I wrote not clear. I mean sendmsg fails after connect, while recvmsg does not fail :)
> > Here is sendmsg abort path:
> >
> > unix_dgram_sendmsg()
> > sock_alloc_send_pskb()
> > if (sk->sk_shutdown & SEND_SHUTDOWN)
> > FAIL
> >
> >> static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg,
> >> size_t size, int flags)
> >> {
> >> struct sock *sk = sock->sk;
> >>
> >> if (sk->sk_state != TCP_ESTABLISHED)
> >> return -ENOTCONN;
> >>
> >> return unix_dgram_recvmsg(sock, msg, size, flags);
> >> }
> >>
> >>
> >>>
> >>> Since this is the only possibility for alive SOCK_SEQPACKET to change
> >>> the state in such way, we should better fix this strange and potentially
> >>> danger corner case.
> >>>
> >>> Also, move TCP_CLOSE assignment for SOCK_DGRAM sockets under state lock.
> >>>
> >>> Signed-off-by: Kirill Tkhai <tkhai@...ru>
> >>
> >> Fixes tag is needed:
> >>
> >> Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
> >>> Before this commit, there was no state change and SEQPACKET sk also went
> >> through the same path. The bug was introduced because the commit did not
> >> consider SEAPACKET.
> >>
> >> So, I think the fix should be like below, then we can free the peer faster.
> >> Note unix_dgram_peer_wake_disconnect_wakeup() is dgram-specific too.
> >>
> >> ---8<---
> >> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >> index b3545fc68097..be40023a61fb 100644
> >> --- a/net/unix/af_unix.c
> >> +++ b/net/unix/af_unix.c
> >> @@ -2001,11 +2001,14 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
> >> err = 0;
> >> if (unix_peer(sk) == other) {
> >> unix_peer(sk) = NULL;
> >> - unix_dgram_peer_wake_disconnect_wakeup(sk, other);
> >> +
> >> + if (sk->sk_type == SOCK_DGRAM) {
> >> + unix_dgram_peer_wake_disconnect_wakeup(sk, other);
> >> + sk->sk_state = TCP_CLOSE;
> >> + }
> >>
> >> unix_state_unlock(sk);
> >>
> >> - sk->sk_state = TCP_CLOSE;
> >> unix_dgram_disconnected(sk, other);
> >> sock_put(other);
> >> err = -ECONNREFUSED;
> >> ---8<---
> >>
> >> Also, it's better to mention that moving TCP_CLOSE under the lock resolves
> >> another rare race with unix_dgram_connect() for DGRAM sk:
> >>
> >> unix_state_unlock(sk);
> >> <--------------------------> connect() could set TCP_ESTABLISHED here.
> >> sk->sk_state = TCP_CLOSE;
> >
> > Sounds good, I'll test this variant. Thanks!
>
> Thinking again, I'd argue about disconnecting from peer (unix_peer(sk) = NULL) here.
> Normally, unix_dgram_sendmsg() never came here like I wrote:
>
> unix_dgram_sendmsg()
> sock_alloc_send_pskb()
> if (sk->sk_shutdown & SEND_SHUTDOWN)
> FAIL with EPIPE
>
> So, this optimization will work very rare, it fact there will not real performance profit.
> But this creates a cornet case (safe but corner), which seems not good. Corner cases are not
> good in general.
>
> I'd leave an original variant. What do you think about this?
I think there is no good reason to delay freeing unused memory, not only
sock_put(other), unix_dgram_disconnected() frees sk->sk_receive_queue.
And if peer is cleared, we need not call sock_alloc_send_pskb() and
return earlier with -ENOTCONN in the following sendmsg()s. It's easy
to read because we need not dive into another layer implemented in
sock_alloc_send_pskb().
Also, at least, the code has been safe for decades, and if we don't
clear peer and sk_receive_queue, we have to check other places if
there are sanity checks for such cases. IMHO, we should minimize the
risk if the patch is for -stable.
Thank you.
>
> Kirill
>
> >>> ---
> >>> net/unix/af_unix.c | 11 +++++++++--
> >>> 1 file changed, 9 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> >>> index b3545fc68097..6fd745cfc492 100644
> >>> --- a/net/unix/af_unix.c
> >>> +++ b/net/unix/af_unix.c
> >>> @@ -1999,13 +1999,20 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
> >>> unix_state_lock(sk);
> >>>
> >>> err = 0;
> >>> - if (unix_peer(sk) == other) {
> >>> + if (sk->sk_type == SOCK_SEQPACKET) {
> >>> + /* We are here only when racing with unix_release_sock()
> >>> + * is clearing @other. Never change state to TCP_CLOSE
> >>> + * unlike SOCK_DGRAM wants.
> >>> + */
> >>> + unix_state_unlock(sk);
> >>> + err = -EPIPE;
> >>> + } else if (unix_peer(sk) == other) {
> >>> unix_peer(sk) = NULL;
> >>> unix_dgram_peer_wake_disconnect_wakeup(sk, other);
> >>>
> >>> + sk->sk_state = TCP_CLOSE;
> >>> unix_state_unlock(sk);
> >>>
> >>> - sk->sk_state = TCP_CLOSE;
> >>> unix_dgram_disconnected(sk, other);
> >>> sock_put(other);
> >>> err = -ECONNREFUSED;
Powered by blists - more mailing lists