Message-ID: <0b9db623-0a69-30e6-1e28-b6acb306c360@gmail.com>
Date: Sun, 1 Mar 2020 19:42:25 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: Kuniyuki Iwashima <kuniyu@...zon.co.jp>, davem@...emloft.net,
kuznet@....inr.ac.ru, yoshfuji@...ux-ipv6.org, edumazet@...gle.com
Cc: kuni1840@...il.com, netdev@...r.kernel.org,
osa-contribution-log@...zon.com
Subject: Re: [PATCH v3 net-next 2/4] tcp: bind(addr, 0) remove the
SO_REUSEADDR restriction when ephemeral ports are exhausted.
On 2/29/20 3:35 AM, Kuniyuki Iwashima wrote:
> Commit aacd9289af8b82f5fb01bcdd53d0e3406d1333c7 ("tcp: bind() use stronger
> condition for bind_conflict") introduced a restriction to forbid to bind
> SO_REUSEADDR enabled sockets to the same (addr, port) tuple in order to
> assign ports dispersedly so that we can connect to the same remote host.
>
> The change results in accelerating port depletion so that we fail to bind
> sockets to the same local port even if we want to connect to the different
> remote hosts.
>
> You can reproduce this issue by following instructions below.
> 1. # sysctl -w net.ipv4.ip_local_port_range="32768 32768"
> 2. set SO_REUSEADDR to two sockets.
> 3. bind two sockets to (address, 0) and the latter fails.
>
> Therefore, when ephemeral ports are exhausted, bind(addr, 0) should
> fallback to the legacy behaviour to enable the SO_REUSEADDR option and make
> it possible to connect to different remote (addr, port) tuples.
>
> This patch allows us to bind SO_REUSEADDR enabled sockets to the same
> (addr, port) only when all ephemeral ports are exhausted.
>
> The only notable thing is that if all sockets bound to the same port have
> both SO_REUSEADDR and SO_REUSEPORT enabled, we can bind sockets to an
> ephemeral port and also do listen().
>
> Fixes: aacd9289af8b ("tcp: bind() use stronger condition for bind_conflict")
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.co.jp>
I am unsure about this, since this could double the time taken by this
function, which is already very time consuming.
Years ago we added the IP_BIND_ADDRESS_NO_PORT socket option, so that the kernel
has more choices at connect() time (instead of bind() time) to choose a source port.
This considerably lowers the time taken to find an optimal source port, since
the kernel has full information (source address, destination address & port).
IP_BIND_ADDRESS_NO_PORT (since Linux 4.2)
    Inform the kernel to not reserve an ephemeral port when using
    bind(2) with a port number of 0. The port will later be
    automatically chosen at connect(2) time, in a way that allows
    sharing a source port as long as the 4-tuple is unique.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=90c337da1524863838658078ec34241f45d8394d
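
For illustration, here is a minimal userspace sketch of the pattern described
above: set IP_BIND_ADDRESS_NO_PORT, bind to (addr, 0), and let the kernel pick
the source port at connect() time. The helper name connect_from() is
hypothetical, and the fallback value 24 is an assumption taken from the Linux
<linux/in.h> definition, used only when the Python socket module does not
export the constant. This is Linux-only and not part of the patch.

```python
import socket

# Assumption: 24 is the Linux value of IP_BIND_ADDRESS_NO_PORT (<linux/in.h>).
# Fall back to it when this Python build does not export the constant.
IP_BIND_ADDRESS_NO_PORT = getattr(socket, "IP_BIND_ADDRESS_NO_PORT", 24)


def connect_from(local_addr, remote):
    """Hypothetical helper: bind to local_addr without reserving a port,
    then connect to remote so the kernel picks the port at connect() time.

    Returns (socket, port seen after bind, port seen after connect).
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel not to reserve an ephemeral port in bind(addr, 0).
    s.setsockopt(socket.IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, 1)
    s.bind((local_addr, 0))
    port_after_bind = s.getsockname()[1]
    # The source port is chosen here, with the full 4-tuple known.
    s.connect(remote)
    port_after_connect = s.getsockname()[1]
    return s, port_after_bind, port_after_connect
```

Calling connect_from("127.0.0.1", (host, port)) against a local listener
should show that no port is assigned by bind() and that one is assigned once
connect() returns.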