Message-ID: <20240214185649.96945-1-kuniyu@amazon.com>
Date: Wed, 14 Feb 2024 10:56:49 -0800
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <edumazet@...gle.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <joannelkoong@...il.com>,
<kuba@...nel.org>, <kuni1840@...il.com>, <kuniyu@...zon.com>,
<netdev@...r.kernel.org>, <pabeni@...hat.com>, <syzkaller@...glegroups.com>
Subject: Re: [PATCH v2 net] dccp/tcp: Unhash sk from ehash for tb2 alloc failure after check_estalblished().
From: Eric Dumazet <edumazet@...gle.com>
Date: Wed, 14 Feb 2024 10:05:33 +0100
> On Tue, Feb 13, 2024 at 10:42 PM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> >
> > syzkaller reported a warning [0] in inet_csk_destroy_sock() with no
> > repro.
> >
> > WARN_ON(inet_sk(sk)->inet_num && !inet_csk(sk)->icsk_bind_hash);
> >
> > However, syzkaller's log hinted that connect() failed just before
> > the warning due to FAULT_INJECTION. [1]
> >
> > When connect() is called for an unbound socket, we search for an
> > available ephemeral port. If a bhash bucket exists for the port, we
> > call __inet_check_established() or __inet6_check_established() to check
> > if the bucket is reusable.
> >
> > If reusable, we add the socket into ehash and set inet_sk(sk)->inet_num.
> >
> > Later, we look up the corresponding bhash2 bucket and try to allocate
> > it if it does not exist.
> >
> > Although it rarely occurs in real use, if the allocation fails, we must
> > revert the changes made by check_established(). Otherwise, an unconnected
> > socket could illegally occupy an ehash entry.
> >
> > Note that we do not put tw back into ehash because sk might have
> > already responded to a packet for tw and it would be better to free
> > tw earlier under such memory pressure.
> >
> > [0]:
> >
> > Reported-by: syzkaller <syzkaller@...glegroups.com>
> > Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> > ---
> > v2:
> > * Unhash twsk from bhash/bhash2
> >
> > v1: https://lore.kernel.org/netdev/20240209025409.27235-1-kuniyu@amazon.com/
> > ---
> > net/ipv4/inet_hashtables.c | 21 +++++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> > index 93e9193df544..b22c71f93297 100644
> > --- a/net/ipv4/inet_hashtables.c
> > +++ b/net/ipv4/inet_hashtables.c
> > @@ -1130,10 +1130,31 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
> >  	return 0;
> >
> >  error:
> > +	if (sk_hashed(sk)) {
> > +		spinlock_t *lock = inet_ehash_lockp(hinfo, sk->sk_hash);
> > +
> > +		sock_prot_inuse_add(net, sk->sk_prot, -1);
> > +
> > +		spin_lock(lock);
> > +		sk_nulls_del_node_init_rcu(sk);
> > +		spin_unlock(lock);
> > +
> > +		sk->sk_hash = 0;
> > +		inet_sk(sk)->inet_sport = 0;
> > +		inet_sk(sk)->inet_num = 0;
> > +
> > +		if (tw)
> > +			inet_twsk_bind_unhash(tw, hinfo);
> > +	}
> > +
> >  	spin_unlock(&head2->lock);
> >  	if (tb_created)
> >  		inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb);
> >  	spin_unlock_bh(&head->lock);
> > +
> > +	if (tw)
> > +		inet_twsk_deschedule_put(tw);
>
> Please make sure to call this while BH is still disabled.
>
>
> 	spin_unlock(&head->lock);
> 	if (tw)
> 		inet_twsk_deschedule_put(tw);
> 	local_bh_enable();
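
For readers following the thread, the ordering point above can be illustrated with a small user-space model. This is only a sketch: every `model_*` name and the `bh_depth` counter are hypothetical stand-ins, not kernel code. It relies on one fact, that spin_unlock_bh() drops the lock and re-enables BH in one step, so anything called after it runs with BH enabled.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins; none of this is kernel code. */
static int bh_depth;                 /* models the BH-disable nesting depth */
static int put_ran_with_bh_off = -1; /* set by the modelled deschedule/put  */

static void model_local_bh_disable(void) { bh_depth++; }
static void model_local_bh_enable(void)  { assert(bh_depth > 0); bh_depth--; }

/* The lock itself is not modelled; only its effect on BH state is. */
static void model_spin_lock_bh(void)   { model_local_bh_disable(); }
static void model_spin_unlock(void)    { /* lock dropped, BH still disabled */ }
static void model_spin_unlock_bh(void) { model_spin_unlock(); model_local_bh_enable(); }

/* Records whether it was called while BH was disabled. */
static void model_twsk_deschedule_put(void)
{
	put_ran_with_bh_off = bh_depth > 0;
}

/* Order used in the v2 patch: unlock and re-enable BH, then put. */
int v2_order(void)
{
	model_spin_lock_bh();            /* taken earlier in the real function */
	model_spin_unlock_bh();
	model_twsk_deschedule_put();
	return put_ran_with_bh_off;
}

/* Order suggested in the review: unlock, put, then re-enable BH. */
int suggested_order(void)
{
	model_spin_lock_bh();            /* taken earlier in the real function */
	model_spin_unlock();
	model_twsk_deschedule_put();
	model_local_bh_enable();
	return put_ran_with_bh_off;
}
```

In this model v2_order() reports that the put ran with BH enabled, while suggested_order() reports that it ran with BH still disabled, which is the property requested in the review.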

Sure, will update in v3 and post a follow-up for the existing
inet_twsk_deschedule_put() to net-next later.

Thanks!
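
As a reading aid, the rollback described in the commit message can be sketched in user space. This is a minimal model under stated assumptions: `struct model_sock`, `ehash_entries`, and all `model_*` functions are hypothetical simplifications, and the `fail_alloc` flag merely plays the role of fault injection. It does not model locking, the timewait socket, or the real hash tables.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the socket state touched here. */
struct model_sock {
	bool hashed;          /* models sk_hashed(sk): present in ehash */
	unsigned short num;   /* models inet_sk(sk)->inet_num           */
	unsigned short sport; /* models inet_sk(sk)->inet_sport         */
};

static int ehash_entries;     /* sockets currently in the modelled ehash */

/* Models check_established(): add the socket to ehash, record the port. */
static void model_check_established(struct model_sock *sk, unsigned short port)
{
	sk->hashed = true;
	sk->num = port;
	sk->sport = port;     /* byte order is ignored in this model */
	ehash_entries++;
}

/* Models the bhash2 bucket allocation; fail_alloc simulates fault injection. */
static void *model_alloc_tb2(bool fail_alloc)
{
	return fail_alloc ? NULL : malloc(1);
}

int model_hash_connect(struct model_sock *sk, unsigned short port, bool fail_alloc)
{
	void *tb2;

	model_check_established(sk, port);

	tb2 = model_alloc_tb2(fail_alloc);
	if (!tb2)
		goto error;

	free(tb2);            /* the model does not keep the bucket */
	return 0;

error:
	/* Revert what the modelled check_established() did. */
	if (sk->hashed) {
		ehash_entries--;
		sk->hashed = false;
		sk->sport = 0;
		sk->num = 0;
	}
	return -ENOMEM;
}
```

With `fail_alloc` set, the modelled socket ends up unhashed with its port fields cleared, so it no longer occupies an ehash entry; without the revert under `error:`, it would remain hashed with a nonzero port even though connect() failed.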