Message-ID: <6b34c251-05af-06c0-8003-858b6ae8d1fd@gmail.com>
Date: Tue, 1 Dec 2020 16:13:39 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Kuniyuki Iwashima <kuniyu@...zon.co.jp>,
"David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <kafai@...com>
Cc: Benjamin Herrenschmidt <benh@...zon.com>,
Kuniyuki Iwashima <kuni1840@...il.com>,
osa-contribution-log@...zon.com, bpf@...r.kernel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 bpf-next 05/11] tcp: Migrate TCP_NEW_SYN_RECV requests.
On 12/1/20 3:44 PM, Kuniyuki Iwashima wrote:
> This patch renames reuseport_select_sock() to __reuseport_select_sock() and
> adds two wrapper functions around it to pass the migration type defined in the
> previous commit.
>
> reuseport_select_sock : BPF_SK_REUSEPORT_MIGRATE_NO
> reuseport_select_migrated_sock : BPF_SK_REUSEPORT_MIGRATE_REQUEST
>
> As mentioned before, we have to select a new listener for TCP_NEW_SYN_RECV
> requests when receiving the final ACK or sending a SYN+ACK. Therefore, this
> patch also changes the code to call reuseport_select_migrated_sock() even
> if the listening socket is TCP_CLOSE. If we can pick out a listening socket
> from the reuseport group, we rewrite request_sock.rsk_listener and resume
> processing the request.
>
> Reviewed-by: Benjamin Herrenschmidt <benh@...zon.com>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
> ---
> include/net/inet_connection_sock.h | 12 +++++++++++
> include/net/request_sock.h | 13 ++++++++++++
> include/net/sock_reuseport.h | 8 +++----
> net/core/sock_reuseport.c | 34 ++++++++++++++++++++++++------
> net/ipv4/inet_connection_sock.c | 13 ++++++++++--
> net/ipv4/tcp_ipv4.c | 9 ++++++--
> net/ipv6/tcp_ipv6.c | 9 ++++++--
> 7 files changed, 81 insertions(+), 17 deletions(-)
>
> diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
> index 2ea2d743f8fc..1e0958f5eb21 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -272,6 +272,18 @@ static inline void inet_csk_reqsk_queue_added(struct sock *sk)
> reqsk_queue_added(&inet_csk(sk)->icsk_accept_queue);
> }
>
> +static inline void inet_csk_reqsk_queue_migrated(struct sock *sk,
> + struct sock *nsk,
> + struct request_sock *req)
> +{
> + reqsk_queue_migrated(&inet_csk(sk)->icsk_accept_queue,
> + &inet_csk(nsk)->icsk_accept_queue,
> + req);
> + sock_put(sk);
> + sock_hold(nsk);
This looks racy to me. The refcount of nsk might already be zero at this point.
If you think it can _not_ be zero, please add a big comment here explaining why,
because that would mean something has already guaranteed it before reaching this
function, and this sock_hold() would not be needed in the first place.
There is a good reason reqsk_alloc() uses refcount_inc_not_zero().
> + req->rsk_listener = nsk;
> +}
> +
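To make the concern about sock_hold(nsk) concrete, here is a minimal userspace
sketch of the "take a reference only if the count is not already zero" pattern.
It is not kernel code and not a proposed fix for this patch: every toy_* name is
hypothetical, and C11 atomics stand in for the kernel's refcount_t helpers. The
point is only that an unconditional increment on an object whose count has
reached zero would resurrect something that is already being freed, while the
conditional form lets the caller back off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock; only the refcount matters here. */
struct toy_sock {
	atomic_int refcnt;
};

/* Hypothetical stand-in for struct request_sock. */
struct toy_req {
	struct toy_sock *listener;
};

/*
 * Conditional hold, in the spirit of refcount_inc_not_zero():
 * increment only while the count is observed to be non-zero.
 */
static inline bool toy_sock_hold_not_zero(struct toy_sock *sk)
{
	int old = atomic_load(&sk->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&sk->refcnt, &old, old + 1))
			return true;
	}
	return false;	/* count already hit zero: object is going away */
}

/* Drop one reference; returns true if it was the last one. */
static inline bool toy_sock_put(struct toy_sock *sk)
{
	int old = atomic_fetch_sub(&sk->refcnt, 1);

	assert(old > 0);	/* a put on a zero count is a bug */
	return old == 1;
}

/*
 * Assumed migration step: pin the new listener first, and only then
 * switch the request over and release the old listener.
 */
static inline bool toy_req_migrate(struct toy_req *req, struct toy_sock *nsk)
{
	struct toy_sock *old = req->listener;

	if (!toy_sock_hold_not_zero(nsk))
		return false;	/* keep the old listener untouched */

	req->listener = nsk;
	toy_sock_put(old);
	return true;
}
```

In this sketch, migrating to a listener whose count is already zero fails and
leaves the request pointing at its old listener. Whether that is the right
fallback for the real code is a separate question.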
Honestly, this patch series looks quite complex, and finding a bug in the
very first function I am looking at is not really a good sign...