Message-ID: <20200310074122.68021-1-kuniyu@amazon.co.jp>
Date: Tue, 10 Mar 2020 16:41:22 +0900
From: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
To: <eric.dumazet@...il.com>
CC: <davem@...emloft.net>, <edumazet@...gle.com>, <kuni1840@...il.com>,
<kuniyu@...zon.co.jp>, <kuznet@....inr.ac.ru>,
<netdev@...r.kernel.org>, <osa-contribution-log@...zon.com>,
<yoshfuji@...ux-ipv6.org>
Subject: Re: [PATCH v4 net-next 2/5] tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted.
From: Eric Dumazet <eric.dumazet@...il.com>
Date: Mon, 9 Mar 2020 21:04:24 -0700
> On 3/8/20 11:16 AM, Kuniyuki Iwashima wrote:
> > Commit aacd9289af8b82f5fb01bcdd53d0e3406d1333c7 ("tcp: bind() use stronger
> > condition for bind_conflict") introduced a restriction to forbid to bind
> > SO_REUSEADDR enabled sockets to the same (addr, port) tuple in order to
> > assign ports dispersedly so that we can connect to the same remote host.
> >
> > The change results in accelerating port depletion so that we fail to bind
> > sockets to the same local port even if we want to connect to the different
> > remote hosts.
> >
> > You can reproduce this issue by following instructions below.
> > 1. # sysctl -w net.ipv4.ip_local_port_range="32768 32768"
> > 2. set SO_REUSEADDR to two sockets.
> > 3. bind two sockets to (localhost, 0) and the latter fails.
> >
> > Therefore, when ephemeral ports are exhausted, bind(0) should fallback to
> > the legacy behaviour to enable the SO_REUSEADDR option and make it possible
> > to connect to different remote (addr, port) tuples.
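A minimal sketch of steps 2 and 3 above, for anyone who wants to try them. It assumes step 1 (narrowing net.ipv4.ip_local_port_range to a single port) has already been done as root; the helper name bind_two_reuseaddr is hypothetical and only for illustration. With the default port range both binds are expected to succeed; with the one-port range, the second entry of the result should be the OSError described above.

```python
import socket


def bind_two_reuseaddr(host="127.0.0.1"):
    """Bind two SO_REUSEADDR sockets to (host, 0).

    Returns a two-element list; each element is either the port the
    kernel assigned or the OSError raised by bind().
    """
    socks = []
    results = []
    try:
        for _ in range(2):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            socks.append(s)
            # Step 2: enable SO_REUSEADDR before binding.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            try:
                # Step 3: port 0 asks the kernel for an ephemeral port.
                s.bind((host, 0))
                results.append(s.getsockname()[1])
            except OSError as exc:
                results.append(exc)
    finally:
        for s in socks:
            s.close()
    return results


if __name__ == "__main__":
    for i, r in enumerate(bind_two_reuseaddr(), start=1):
        print(f"socket {i}: {r!r}")
```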
>
> Sadly this commit tries hard to support obsolete SO_REUSEADDR for active connections,
> which makes little sense now we have more powerful IP_BIND_ADDRESS_NO_PORT
>
> SO_REUSEADDR only really makes sense for a listener, because you want a
> server to be able to restart after core dump, while prior sockets are still
> kept in TIME_WAIT state.
>
> Same for SO_REUSEPORT : it only made sense for sharded listeners in linux kernel.
>
> Trying to allocate a sport at bind() time, without knowing the destination address/port
> is really not something that can be fixed.
>
> Your patches might allow a 2x increase, while IP_BIND_ADDRESS_NO_PORT
> basically allows for 1000x increase of the possible combinations.
>
>
>
> >
> > This patch allows us to bind SO_REUSEADDR enabled sockets to the same
> > (addr, port) only when all ephemeral ports are exhausted.
> >
> > The only notable thing is that if all sockets bound to the same port have
> > both SO_REUSEADDR and SO_REUSEPORT enabled, we can bind sockets to an
> > ephemeral port and also do listen().
> >
> > Fixes: aacd9289af8b ("tcp: bind() use stronger condition for bind_conflict")
>
> I disagree with this Fixes: tag : I do not want this patch in stable kernels,
> particularly if you put the sysctl patch as a followup without a Fixes: tag.
>
> Please reorder your patch to first introduce the sysctl, then this one.
>
> Or squash the two patches.
I'm sorry, I will remove the tag and squash the patches.
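
For reference, a minimal sketch of the IP_BIND_ADDRESS_NO_PORT approach mentioned above: the socket is bound to a local address only, and the source port is picked at connect() time, when the destination is known. The helper name connect_from is hypothetical, and the fallback value 24 is an assumption taken from the Linux <linux/in.h> definition for Python builds that do not export the constant. This is Linux-specific and needs a kernel that supports the option.

```python
import socket

# Assumption: use Python's constant when available, otherwise the
# Linux value of IP_BIND_ADDRESS_NO_PORT from <linux/in.h>.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(local_addr, remote):
    """Connect to `remote` from `local_addr` without reserving a port at bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Ask the kernel not to reserve an ephemeral port in bind().
        s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
        # Only the local address is fixed here.
        s.bind((local_addr, 0))
        # The source port is chosen here, with the full 4-tuple known.
        s.connect(remote)
    except OSError:
        s.close()
        raise
    return s


if __name__ == "__main__":
    # Local listener so the example does not depend on any remote host.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = connect_from("127.0.0.1", listener.getsockname())
    print("local:", client.getsockname(), "remote:", client.getpeername())
    client.close()
    listener.close()
```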