Message-ID: <20230922174718.42473-1-kuniyu@amazon.com>
Date: Fri, 22 Sep 2023 10:47:18 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <gnault@...hat.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <edumazet@...gle.com>,
<kuba@...nel.org>, <netdev@...r.kernel.org>, <pabeni@...hat.com>,
<kuniyu@...zon.com>
Subject: Re: [PATCH net-next] tcp: Dump bound-only sockets in inet_diag.
From: Guillaume Nault <gnault@...hat.com>
Date: Fri, 22 Sep 2023 18:59:57 +0200
> Walk the hashinfo->bhash table so that inet_diag can dump TCP sockets
I think we should use bhash2 here, as bhash buckets can become very long
with reuseport listeners. That's why bhash2 was introduced.
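Something like the sketch below, completely untested, just to show the
idea. I'm assuming the existing bhash2 helpers (ib2_net() and
sk_for_each_bound_bhash2()) can be reused here, and IIRC
inet_bind_bucket_for_each() works for bhash2 too since inet_bind2_bucket
also has a 'node' member. Also note that bhash_size is a bucket count,
not a mask, so the loop condition should be '<' rather than '<='.

	for (i = s_i; i < hashinfo->bhash_size; i++) {
		struct inet_bind_hashbucket *ibb;
		struct inet_bind2_bucket *tb2;

		num = 0;
		ibb = &hashinfo->bhash2[i];

		spin_lock_bh(&ibb->lock);
		inet_bind_bucket_for_each(tb2, &ibb->chain) {
			if (!net_eq(ib2_net(tb2), net))
				continue;

			sk_for_each_bound_bhash2(sk, &tb2->owners) {
				/* same per-socket checks and
				 * inet_sk_diag_fill() as in the bhash
				 * version below
				 */
			}
		}
		spin_unlock_bh(&ibb->lock);

		s_num = 0;
	}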
> that are bound but haven't yet called connect() or listen().
>
> This allows ss to dump bound-only TCP sockets, together with listening
> sockets (as there's no specific state for bound-only sockets). This is
> similar to the UDP behaviour for which bound-only sockets are already
> dumped by ss -lu.
>
> The code is inspired by the ->lhash2 loop. However there's no manual
> test of the source port, since this kind of filtering is already
> handled by inet_diag_bc_sk().
>
> No change is needed for ss. With an IPv4, an IPv6 and an IPv6-only
> socket, bound respectively to 40000, 64000, 60000, the result is:
>
> $ ss -lt
> State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
> UNCONN 0 0 0.0.0.0:40000 0.0.0.0:*
> UNCONN 0 0 [::]:60000 [::]:*
> UNCONN 0 0 *:64000 *:*
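(For reference, the three bound-only sockets above can be created with a
userspace snippet along these lines -- untested, error handling omitted,
port numbers taken from the example:)

#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct sockaddr_in a4 = { .sin_family = AF_INET,
				  .sin_port = htons(40000) };
	struct sockaddr_in6 a6 = { .sin6_family = AF_INET6,
				   .sin6_port = htons(64000) };
	struct sockaddr_in6 a6only = { .sin6_family = AF_INET6,
				       .sin6_port = htons(60000) };
	int s1, s2, s3, one = 1;

	/* IPv4 socket, bound but neither listen()ing nor connect()ed */
	s1 = socket(AF_INET, SOCK_STREAM, 0);
	bind(s1, (struct sockaddr *)&a4, sizeof(a4));

	/* dual-stack IPv6 socket -> shows up as "*:64000" */
	s2 = socket(AF_INET6, SOCK_STREAM, 0);
	bind(s2, (struct sockaddr *)&a6, sizeof(a6));

	/* IPv6-only socket -> shows up as "[::]:60000" */
	s3 = socket(AF_INET6, SOCK_STREAM, 0);
	setsockopt(s3, IPPROTO_IPV6, IPV6_V6ONLY, &one, sizeof(one));
	bind(s3, (struct sockaddr *)&a6only, sizeof(a6only));

	pause();	/* keep the sockets bound while running ss -lt */
	return 0;
}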
>
> Signed-off-by: Guillaume Nault <gnault@...hat.com>
> ---
> net/ipv4/inet_diag.c | 54 ++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 54 insertions(+)
>
> diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
> index e13a84433413..de9c0c8cf42b 100644
> --- a/net/ipv4/inet_diag.c
> +++ b/net/ipv4/inet_diag.c
> @@ -1077,6 +1077,60 @@ void inet_diag_dump_icsk(struct inet_hashinfo *hashinfo, struct sk_buff *skb,
>  		s_i = num = s_num = 0;
>  	}
>  
> +	/* Dump bound-only sockets */
> +	if (cb->args[0] == 1) {
> +		if (!(idiag_states & TCPF_CLOSE))
> +			goto skip_bind_ht;
> +
> +		for (i = s_i; i <= hashinfo->bhash_size; i++) {
> +			struct inet_bind_hashbucket *ibb;
> +			struct inet_bind_bucket *tb;
> +
> +			num = 0;
> +			ibb = &hashinfo->bhash[i];
> +
> +			spin_lock_bh(&ibb->lock);
> +			inet_bind_bucket_for_each(tb, &ibb->chain) {
> +				if (!net_eq(ib_net(tb), net))
> +					continue;
> +
> +				sk_for_each_bound(sk, &tb->owners) {
> +					struct inet_sock *inet = inet_sk(sk);
> +
> +					if (num < s_num)
> +						goto next_bind;
> +
> +					if (sk->sk_state != TCP_CLOSE ||
> +					    !inet->inet_num)
> +						goto next_bind;
> +
> +					if (r->sdiag_family != AF_UNSPEC &&
> +					    r->sdiag_family != sk->sk_family)
> +						goto next_bind;
> +
> +					if (!inet_diag_bc_sk(bc, sk))
> +						goto next_bind;
> +
> +					if (inet_sk_diag_fill(sk, NULL, skb,
> +							      cb, r,
> +							      NLM_F_MULTI,
> +							      net_admin) < 0) {
> +						spin_unlock_bh(&ibb->lock);
> +						goto done;
> +					}
> +next_bind:
> +					num++;
> +				}
> +			}
> +			spin_unlock_bh(&ibb->lock);
Here we should add cond_resched(); otherwise syzbot could abuse this and
report a hung task.
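e.g. (untested) once per bucket, right after the lock is dropped, so we
never resched with the bucket lock held:

			spin_unlock_bh(&ibb->lock);
			cond_resched();

			s_num = 0;
		}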
> +
> +			s_num = 0;
> +		}
> +skip_bind_ht:
> +		cb->args[0] = 2;
> +		s_i = num = s_num = 0;
> +	}
> +
>  	if (!(idiag_states & ~TCPF_LISTEN))
>  		goto out;
>
> --
> 2.39.2