Message-ID: <20240214190701.97643-1-kuniyu@amazon.com>
Date: Wed, 14 Feb 2024 11:07:01 -0800
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <kuniyu@...zon.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <edumazet@...gle.com>,
<joannelkoong@...il.com>, <kuba@...nel.org>, <kuni1840@...il.com>,
<netdev@...r.kernel.org>, <pabeni@...hat.com>, <syzkaller@...glegroups.com>
Subject: Re: [PATCH v2 net] dccp/tcp: Unhash sk from ehash for tb2 alloc failure after check_established().
From: Kuniyuki Iwashima <kuniyu@...zon.com>
Date: Wed, 14 Feb 2024 10:56:49 -0800
> From: Eric Dumazet <edumazet@...gle.com>
> Date: Wed, 14 Feb 2024 10:05:33 +0100
> > On Tue, Feb 13, 2024 at 10:42 PM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> > >
> > > syzkaller reported a warning [0] in inet_csk_destroy_sock() with no
> > > repro.
> > >
> > > WARN_ON(inet_sk(sk)->inet_num && !inet_csk(sk)->icsk_bind_hash);
> > >
> > > However, syzkaller's log hinted that connect() failed just before
> > > the warning due to FAULT_INJECTION. [1]
> > >
> > > When connect() is called for an unbound socket, we search for an
> > > available ephemeral port. If a bhash bucket exists for the port, we
> > > call __inet_check_established() or __inet6_check_established() to check
> > > if the bucket is reusable.
> > >
> > > If reusable, we add the socket into ehash and set inet_sk(sk)->inet_num.
> > >
> > > Later, we look up the corresponding bhash2 bucket and try to allocate
> > > it if it does not exist.
> > >
> > > Although it rarely occurs in real use, if the allocation fails, we must
> > > revert the changes made by check_established(). Otherwise, an unconnected
> > > socket could illegally occupy an ehash entry.
> > >
> > > Note that we do not put tw back into ehash because sk might have
> > > already responded to a packet for tw and it would be better to free
> > > tw earlier under such memory pressure.
> > >
> > > [0]:
> > >
> > > Reported-by: syzkaller <syzkaller@...glegroups.com>
> > > Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> > > ---
> > > v2:
> > > * Unhash twsk from bhash/bhash2
> > >
> > > v1: https://lore.kernel.org/netdev/20240209025409.27235-1-kuniyu@amazon.com/
> > > ---
> > > net/ipv4/inet_hashtables.c | 21 +++++++++++++++++++++
> > > 1 file changed, 21 insertions(+)
> > >
> > > diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> > > index 93e9193df544..b22c71f93297 100644
> > > --- a/net/ipv4/inet_hashtables.c
> > > +++ b/net/ipv4/inet_hashtables.c
> > > @@ -1130,10 +1130,31 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
> > >  	return 0;
> > >
> > >  error:
> > > +	if (sk_hashed(sk)) {
> > > +		spinlock_t *lock = inet_ehash_lockp(hinfo, sk->sk_hash);
> > > +
> > > +		sock_prot_inuse_add(net, sk->sk_prot, -1);
> > > +
> > > +		spin_lock(lock);
> > > +		sk_nulls_del_node_init_rcu(sk);
> > > +		spin_unlock(lock);
> > > +
> > > +		sk->sk_hash = 0;
> > > +		inet_sk(sk)->inet_sport = 0;
> > > +		inet_sk(sk)->inet_num = 0;
> > > +
> > > +		if (tw)
> > > +			inet_twsk_bind_unhash(tw, hinfo);
> > > +	}
> > > +
> > >  	spin_unlock(&head2->lock);
> > >  	if (tb_created)
> > >  		inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb);
> > >  	spin_unlock_bh(&head->lock);
> > > +
> > > +	if (tw)
> > > +		inet_twsk_deschedule_put(tw);
> >
> > Please make sure to call this while BH is still disabled.
> >
> >
> > 	spin_unlock(&head->lock);
> > 	if (tw)
> > 		inet_twsk_deschedule_put(tw);
> > 	local_bh_enable();
>
> Sure, will update in v3 and post a followup for the existing
> inet_twsk_deschedule_put() to net-next later.
The existing one was already doing that; it seems I somehow copied
different code from elsewhere ... :p
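
For anyone following the thread, here is a rough user-space sketch of the
rollback the error path is meant to perform. It is only a toy model, not
kernel code: every toy_* name is made up for illustration, the fail_alloc
flag stands in for FAULT_INJECTION, and locking, BH handling and the tw
cleanup are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel objects; all names here are hypothetical. */
struct toy_sock {
	bool hashed;          /* models sk_hashed(sk): present in ehash */
	unsigned int hash;    /* models sk->sk_hash */
	unsigned short num;   /* models inet_sk(sk)->inet_num */
	unsigned short sport; /* models inet_sk(sk)->inet_sport */
};

struct toy_tables {
	int ehash_entries;    /* number of sockets in the toy ehash */
	int inuse;            /* models the sock_prot_inuse counter */
};

/* Models check_established() succeeding: insert into ehash, claim the port. */
static void toy_check_established(struct toy_tables *t, struct toy_sock *sk,
				  unsigned short port)
{
	sk->hashed = true;
	sk->hash = port * 2654435761u;	/* arbitrary toy hash value */
	sk->num = port;
	sk->sport = port;
	t->ehash_entries++;
	t->inuse++;
}

/* Models the bhash2 bucket allocation; fail_alloc plays FAULT_INJECTION. */
static void *toy_alloc_tb2(bool fail_alloc)
{
	return fail_alloc ? NULL : malloc(1);
}

/* Returns 0 on success, -1 if the bucket allocation failed. */
static int toy_hash_connect(struct toy_tables *t, struct toy_sock *sk,
			    unsigned short port, bool fail_alloc)
{
	void *tb2;

	assert(t != NULL && sk != NULL);
	toy_check_established(t, sk, port);

	tb2 = toy_alloc_tb2(fail_alloc);
	if (tb2) {
		free(tb2);	/* the toy does not keep the bucket around */
		return 0;
	}

	/* Error path: undo everything toy_check_established() did. */
	if (sk->hashed) {
		t->inuse--;
		t->ehash_entries--;
		sk->hashed = false;
		sk->hash = 0;
		sk->sport = 0;
		sk->num = 0;
	}
	return -1;
}
```

Without the rollback branch, a failed call would leave the toy socket
hashed with a nonzero num, which is the state the WARN_ON() quoted above
complains about.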