Message-ID: <b20965f7-e251-4793-951e-f211d179dbba@suse.de>
Date: Tue, 3 Feb 2026 19:02:36 +0100
From: Fernando Fernandez Mancera <fmancera@...e.de>
To: netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, corbet@....net, ncardwell@...gle.com,
kuniyu@...gle.com, dsahern@...nel.org, idosch@...dia.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
Thorsten Toepper <thorsten.toepper@....com>
Subject: Re: [PATCH RFC net-next] inet: add ip_retry_random_port sysctl to
reduce sequential port retries
On 2/3/26 6:54 PM, Fernando Fernandez Mancera wrote:
> With the current port selection algorithm, ports that follow a reserved
> port or a port that has been in use for a long time are selected more
> often than others. This interacts badly with cloud environments that
> block connections between the application server and the database
> server if there was a previous connection with the same source port,
> leading to connectivity problems between applications in such
> environments.
>
> The situation is that a source tuple is usable again after being closed
> for a maximum segment lifetime of two minutes, while the firewall still
> tracks it as existing for 60 minutes or longer. If the port is reused
> for the same target tuple before the firewall cleans up, the connection
> fails due to firewall interference, which in turn resets the activity
> timeout in the firewall's own table. We understand the real
> issue here is that these firewalls cannot cope with standards-compliant
> port reuse. But this is a workaround for such situations and an
> improvement on the distribution of ports selected.
>
> The proposed solution is, instead of incrementing the port number, to
> re-select a new random port within the remaining range. This behavior
> is enabled via the new sysctl option
> "net.ipv4.ip_retry_random_port".
>
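To illustrate the difference between the two retry strategies, here is a
simplified userspace model. It is only a sketch and not the code from the
patch: pick_port(), port_busy_fn, port_below() and the xorshift PRNG are
hypothetical names chosen for this example, and the real kernel logic has
details that are not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only; none of these names exist in the kernel. */
typedef bool (*port_busy_fn)(uint16_t port, void *ctx);

static uint32_t prng_state = 2463534242u; /* any nonzero seed works */

/* xorshift32: small deterministic PRNG, good enough for a simulation. */
static uint32_t prng_next(void)
{
	prng_state ^= prng_state << 13;
	prng_state ^= prng_state >> 17;
	prng_state ^= prng_state << 5;
	return prng_state;
}

/* Example predicate: every port below *ctx counts as busy or reserved. */
static bool port_below(uint16_t port, void *ctx)
{
	return port < *(uint16_t *)ctx;
}

/*
 * Pick a port in [low, high]. On a collision, either step to the next
 * port (retry_random == false) or draw a fresh random offset
 * (retry_random == true). Returns 0 if max_tries attempts all collide.
 * *retries receives the number of collisions seen.
 */
static uint16_t pick_port(uint16_t low, uint16_t high, bool retry_random,
			  unsigned int max_tries, port_busy_fn busy,
			  void *ctx, unsigned int *retries)
{
	uint32_t range = (uint32_t)high - low + 1;
	uint32_t offset = prng_next() % range; /* slight modulo bias */
	unsigned int i;

	*retries = 0;
	for (i = 0; i < max_tries; i++) {
		uint16_t port = (uint16_t)(low + offset);

		if (!busy(port, ctx))
			return port;

		(*retries)++;
		if (retry_random)
			offset = prng_next() % range;
		else
			offset = (offset + 1) % range;
	}
	return 0;
}
```

In this model, with sequential retry every starting offset that lands
inside a busy block ends up on the first free port after the block, which
is the skew described above. With random retry the collisions are spread
over the whole range, at the cost that a free port is not guaranteed to
be found within max_tries attempts.
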
> The test setup consists of two processes, a client and a server: the
> client connects to the server in a loop and the server sends some
> bytes back. The results we got are promising:
>
> Executed test: Current algorithm
> ephemeral port range: 9000-65499
> simulated selections: 10000000
> retries during simulation: 14197718
> longest retry sequence: 5202
>
> Executed test: Proposed modified algorithm
> ephemeral port range: 9000-65499
> simulated selections: 10000000
> retries during simulation: 3976671
> longest retry sequence: 12
>
> In addition, the generated graphs show that the distribution of source
> ports is more even with the proposed patch.
>
> Signed-off-by: Fernando Fernandez Mancera <fmancera@...e.de>
> Tested-by: Thorsten Toepper <thorsten.toepper@....com>
> ---
> .../networking/net_cachelines/netns_ipv4_sysctl.rst | 1 +
> include/net/netns/ipv4.h | 1 +
> net/ipv4/inet_hashtables.c | 7 ++++++-
> net/ipv4/sysctl_net_ipv4.c | 7 +++++++
> 4 files changed, 15 insertions(+), 1 deletion(-)
>
I just noticed I didn't add the following diffs to the patch. Please
keep them in mind, and sorry for the inconvenience.
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index bc9a01606daf..e6ae9400332c 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1610,6 +1610,17 @@ ip_local_reserved_ports - list of comma separated ranges
Default: Empty
+ip_retry_random_port - BOOLEAN
+ Randomize the selection of a new port if a reserved port is hit during
+ automatic port selection instead of incrementing the port number.
+
+ Possible values:
+
+ - 0 (disabled)
+ - 1 (enabled)
+
+ Default: 0 (disabled)
+
ip_unprivileged_port_start - INTEGER
This is a per-namespace sysctl. It defines the first
unprivileged port in the network namespace. Privileged ports
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 5eade7d9e4a2..32ca260701ba 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -828,6 +828,8 @@ static struct ctl_table ipv4_net_table[] = {
.data = &init_net.ipv4.sysctl_ip_retry_random_port,
.mode = 0644,
.proc_handler = proc_dou8vec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
},
{
.procname = "ip_local_reserved_ports",