Message-ID: <20260203175422.4620-1-fmancera@suse.de>
Date: Tue,  3 Feb 2026 18:54:22 +0100
From: Fernando Fernandez Mancera <fmancera@...e.de>
To: netdev@...r.kernel.org
Cc: davem@...emloft.net,
	edumazet@...gle.com,
	kuba@...nel.org,
	pabeni@...hat.com,
	horms@...nel.org,
	corbet@....net,
	ncardwell@...gle.com,
	kuniyu@...gle.com,
	dsahern@...nel.org,
	idosch@...dia.com,
	linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Fernando Fernandez Mancera <fmancera@...e.de>,
	Thorsten Toepper <thorsten.toepper@....com>
Subject: [PATCH RFC net-next] inet: add ip_retry_random_port sysctl to reduce sequential port retries

With the current port selection algorithm, ports that follow a reserved
port or a port that has been in use for a long time are selected more
often than others. This interacts badly with cloud environments whose
firewalls block connections between the application server and the
database server if a previous connection used the same source port. The
result is connectivity problems between applications in those
environments.

A source tuple becomes usable again once it has been closed for the
maximum segment lifetime, at most two minutes, while the firewall still
tracks it as existing for 60 minutes or longer. If the port is reused
for the same target tuple before the firewall cleans up its entry, the
connection fails due to firewall interference, which in turn resets the
activity timeout in the firewall's own table. We understand that the
real issue is that these firewalls cannot cope with standards-compliant
port reuse. Still, this patch is a workaround for such situations and
improves the distribution of the selected ports.

The proposed solution is to re-select a new random port within the
remaining range instead of incrementing the port number. The behavior
is enabled via the new sysctl option "net.ipv4.ip_retry_random_port".
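
As an illustration only, the two retry strategies can be sketched in
user space as follows. This is not the kernel code: pick_port(),
rand_below() and example_reserved() are hypothetical names, the blocked
range 9000-9999 is an assumed example, and the sketch ignores the
parity step used by __inet_hash_connect().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reserved-port check such as
 * inet_is_local_reserved_port(); blocks 9000-9999 as an assumed example. */
static bool example_reserved(uint32_t port)
{
	return port >= 9000 && port < 10000;
}

/* Value in [0, n). The kernel patch uses get_random_u32_below();
 * rand() is only a user-space placeholder with a slight modulo bias. */
static uint32_t rand_below(uint32_t n)
{
	return (uint32_t)rand() % n;
}

/*
 * Pick a port in [low, high) for which blocked() is false.
 * retry_random == false: on a blocked port, try the next port (wrapping).
 * retry_random == true:  on a blocked port, draw a new random port.
 * Returns 0 if no usable port was found after (high - low) attempts;
 * *retries receives the number of blocked ports that were skipped.
 */
static uint32_t pick_port(uint32_t low, uint32_t high, bool retry_random,
			  bool (*blocked)(uint32_t), uint32_t *retries)
{
	uint32_t remaining = high - low;
	uint32_t port = low + rand_below(remaining);
	uint32_t i;

	*retries = 0;
	for (i = 0; i < remaining; i++) {
		if (!blocked(port))
			return port;
		(*retries)++;
		if (retry_random) {
			port = low + rand_below(remaining);
		} else {
			port++;
			if (port >= high)
				port = low;
		}
	}
	return 0;
}
```

With sequential retry, every initial pick that lands inside the blocked
run ends up on the first port after it (10000 here), which is the bias
described above. With random retry, those picks are spread over the
whole range again.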

The test consists of two processes, a client and a server: the client
connects to the server in a loop and some bytes are sent back. The
results we got are promising:

Executed test: Current algorithm
ephemeral port range: 9000-65499
simulated selections: 10000000
retries during simulation: 14197718
longest retry sequence: 5202

Executed test: Proposed modified algorithm
ephemeral port range: 9000-65499
simulated selections: 10000000
retries during simulation: 3976671
longest retry sequence: 12

That is roughly 72% fewer retries, and the longest retry sequence drops
from 5202 to 12. In addition, the graphs we generated show that the
distribution of source ports is more even with the proposed patch.

Signed-off-by: Fernando Fernandez Mancera <fmancera@...e.de>
Tested-by: Thorsten Toepper <thorsten.toepper@....com>
---
 .../networking/net_cachelines/netns_ipv4_sysctl.rst        | 1 +
 include/net/netns/ipv4.h                                   | 1 +
 net/ipv4/inet_hashtables.c                                 | 7 ++++++-
 net/ipv4/sysctl_net_ipv4.c                                 | 7 +++++++
 4 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
index beaf1880a19b..c4041fdca01e 100644
--- a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
+++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
@@ -47,6 +47,7 @@ u8                              sysctl_tcp_ecn
 u8                              sysctl_tcp_ecn_fallback
 u8                              sysctl_ip_default_ttl                                                                ip4_dst_hoplimit/ip_select_ttl
 u8                              sysctl_ip_no_pmtu_disc
+u8                              sysctl_ip_retry_random_port
 u8                              sysctl_ip_fwd_use_pmtu                       read_mostly                             ip_dst_mtu_maybe_forward/ip_skb_dst_mtu
 u8                              sysctl_ip_fwd_update_priority                                                        ip_forward
 u8                              sysctl_ip_nonlocal_bind
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 2dbd46fc4734..d04b07e7c935 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -156,6 +156,7 @@ struct netns_ipv4 {
 
 	u8 sysctl_ip_default_ttl;
 	u8 sysctl_ip_no_pmtu_disc;
+	u8 sysctl_ip_retry_random_port;
 	u8 sysctl_ip_fwd_update_priority;
 	u8 sysctl_ip_nonlocal_bind;
 	u8 sysctl_ip_autobind_reuse;
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index f5826ec4bcaa..f1c79a7d3fd3 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -1088,8 +1088,13 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
 	for (i = 0; i < remaining; i += step, port += step) {
 		if (unlikely(port >= high))
 			port -= remaining;
-		if (inet_is_local_reserved_port(net, port))
+		if (inet_is_local_reserved_port(net, port)) {
+			if (net->ipv4.sysctl_ip_retry_random_port) {
+				port = low + get_random_u32_below(remaining);
+				port = ((port & 1) == step) ? port : (port - 1);
+			}
 			continue;
+		}
 		head = &hinfo->bhash[inet_bhashfn(net, port,
 						  hinfo->bhash_size)];
 		rcu_read_lock();
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index a1a50a5c80dc..5eade7d9e4a2 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -822,6 +822,13 @@ static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= ipv4_local_port_range,
 	},
+	{
+		.procname	= "ip_retry_random_port",
+		.maxlen		= sizeof(u8),
+		.data		= &init_net.ipv4.sysctl_ip_retry_random_port,
+		.mode		= 0644,
+		.proc_handler	= proc_dou8vec_minmax,
+	},
 	{
 		.procname	= "ip_local_reserved_ports",
 		.data		= &init_net.ipv4.sysctl_local_reserved_ports,
-- 
2.52.0

