Message-ID: <ad8dec4c-d3e0-46c6-a943-c7f3c786c802@suse.de>
Date: Wed, 4 Feb 2026 17:25:18 +0100
From: Fernando Fernandez Mancera <fmancera@...e.de>
To: netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
 pabeni@...hat.com, horms@...nel.org, corbet@....net, ncardwell@...gle.com,
 kuniyu@...gle.com, dsahern@...nel.org, idosch@...dia.com,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 Thorsten Toepper <thorsten.toepper@....com>
Subject: Re: [PATCH RFC net-next] inet: add ip_retry_random_port sysctl to
 reduce sequential port retries

On 2/3/26 6:54 PM, Fernando Fernandez Mancera wrote:
> With the current port selection algorithm, ports that directly follow a
> reserved port or a port that has been in use for a long time are
> selected more often than others. This interacts badly with cloud
> environments that block connections between the application server and
> the database server if there was a previous connection with the same
> source port, and it leads to connectivity problems between applications
> in such environments.
> 
> The situation is that a source tuple is usable again after being closed
> for a maximum segment lifetime of two minutes, while the firewall still
> tracks it as existing for 60 minutes or longer. If the port is reused
> for the same target tuple before the firewall cleans up, the connection
> fails due to firewall interference, which in turn resets the activity
> timeout in the firewall's own table. We understand the real issue here
> is that these firewalls cannot cope with standards-compliant port
> reuse. But this is a workaround for such situations and an improvement
> on the distribution of ports selected.
> 
> The proposed solution is, instead of incrementing the port number, to
> re-select a new random port within the remaining range. This behavior
> is enabled via the new sysctl option "net.ipv4.ip_retry_random_port".
> 
> The test run consists of two processes, a client and a server; the
> client connects to the server in a loop and the server sends some bytes
> back. The results we got are promising:
> 
> Executed test: Current algorithm
> ephemeral port range: 9000-65499
> simulated selections: 10000000
> retries during simulation: 14197718
> longest retry sequence: 5202
> 
> Executed test: Proposed modified algorithm
> ephemeral port range: 9000-65499
> simulated selections: 10000000
> retries during simulation: 3976671
> longest retry sequence: 12
> 
> In addition, the generated graphs show that the distribution of source
> ports is more even with the proposed patch.
> 
> Signed-off-by: Fernando Fernandez Mancera <fmancera@...e.de>
> Tested-by: Thorsten Toepper <thorsten.toepper@....com>
> ---
>   .../networking/net_cachelines/netns_ipv4_sysctl.rst        | 1 +
>   include/net/netns/ipv4.h                                   | 1 +
>   net/ipv4/inet_hashtables.c                                 | 7 ++++++-
>   net/ipv4/sysctl_net_ipv4.c                                 | 7 +++++++
>   4 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
> index beaf1880a19b..c4041fdca01e 100644
> --- a/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
> +++ b/Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
> @@ -47,6 +47,7 @@ u8                              sysctl_tcp_ecn
>   u8                              sysctl_tcp_ecn_fallback
>   u8                              sysctl_ip_default_ttl                                                                ip4_dst_hoplimit/ip_select_ttl
>   u8                              sysctl_ip_no_pmtu_disc
> +u8                              sysctl_ip_retry_random_port
>   u8                              sysctl_ip_fwd_use_pmtu                       read_mostly                             ip_dst_mtu_maybe_forward/ip_skb_dst_mtu
>   u8                              sysctl_ip_fwd_update_priority                                                        ip_forward
>   u8                              sysctl_ip_nonlocal_bind
> diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
> index 2dbd46fc4734..d04b07e7c935 100644
> --- a/include/net/netns/ipv4.h
> +++ b/include/net/netns/ipv4.h
> @@ -156,6 +156,7 @@ struct netns_ipv4 {
>   
>   	u8 sysctl_ip_default_ttl;
>   	u8 sysctl_ip_no_pmtu_disc;
> +	u8 sysctl_ip_retry_random_port;
>   	u8 sysctl_ip_fwd_update_priority;
>   	u8 sysctl_ip_nonlocal_bind;
>   	u8 sysctl_ip_autobind_reuse;
> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> index f5826ec4bcaa..f1c79a7d3fd3 100644
> --- a/net/ipv4/inet_hashtables.c
> +++ b/net/ipv4/inet_hashtables.c
> @@ -1088,8 +1088,13 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
>   	for (i = 0; i < remaining; i += step, port += step) {
>   		if (unlikely(port >= high))
>   			port -= remaining;
> -		if (inet_is_local_reserved_port(net, port))
> +		if (inet_is_local_reserved_port(net, port)) {
> +			if (net->ipv4.sysctl_ip_retry_random_port) {
> +				port = low + get_random_u32_below(remaining);
> +				port = ((port & 1) == step) ? port : (port - 1);

The AI bot made a good observation
(https://netdev-ai.bots.linux.dev/ai-review.html?id=c1544ebc-4c9d-45c5-bce9-784764102912).
I think the following would be better, as it keeps the random scan within
the same parity when needed.

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index f1c79a7d3fd3..c9650079f9e5 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -1090,8 +1090,11 @@ int __inet_hash_connect(struct inet_timewait_death_row *death_row,
  			port -= remaining;
  		if (inet_is_local_reserved_port(net, port)) {
  			if (net->ipv4.sysctl_ip_retry_random_port) {
-				port = low + get_random_u32_below(remaining);
-				port = ((port & 1) == step) ? port : (port - 1);
+				u32 candidate = low + get_random_u32_below(remaining);
+
+				if (step == 2 && (candidate & 1) != (port & 1))
+					candidate++;
+				port = candidate;
  			}
  			continue;
  		}
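
To make the parity handling easier to check outside the kernel, here is a
rough userspace sketch of the same selection step. It is only an
illustration under assumptions: random_below() is a hypothetical stand-in
for get_random_u32_below() built on rand(), pick_candidate() is a name made
up for this example, and the surrounding scan loop is not reproduced.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the kernel's get_random_u32_below().
 * rand() is neither uniform over the range nor suitable for real port
 * selection; it is used here for illustration only.
 */
static uint32_t random_below(uint32_t ceil)
{
	assert(ceil > 0);
	return (uint32_t)rand() % ceil;
}

/*
 * Pick a random candidate in [low, low + remaining). When the scan walks
 * ports in steps of two, bump the candidate by one if its parity differs
 * from the port currently being scanned, so the scan stays on the same
 * parity. With step == 1 the candidate is returned unchanged.
 *
 * Note that the bump can yield low + remaining; this sketch assumes the
 * caller's wraparound check deals with that value.
 */
static uint32_t pick_candidate(uint32_t low, uint32_t remaining,
			       uint32_t step, uint32_t port)
{
	uint32_t candidate = low + random_below(remaining);

	if (step == 2 && (candidate & 1) != (port & 1))
		candidate++;
	return candidate;
}
```

With the range from the test above (low = 9000, remaining = 56500) and
step = 2, every candidate returned for an odd port is odd and every
candidate returned for an even port is even.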

> +			}
>   			continue;
> +		}
>   		head = &hinfo->bhash[inet_bhashfn(net, port,
>   						  hinfo->bhash_size)];
>   		rcu_read_lock();
> diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
> index a1a50a5c80dc..5eade7d9e4a2 100644
> --- a/net/ipv4/sysctl_net_ipv4.c
> +++ b/net/ipv4/sysctl_net_ipv4.c
> @@ -822,6 +822,13 @@ static struct ctl_table ipv4_net_table[] = {
>   		.mode		= 0644,
>   		.proc_handler	= ipv4_local_port_range,
>   	},
> +	{
> +		.procname	= "ip_retry_random_port",
> +		.maxlen		= sizeof(u8),
> +		.data		= &init_net.ipv4.sysctl_ip_retry_random_port,
> +		.mode		= 0644,
> +		.proc_handler	= proc_dou8vec_minmax,
> +	},
>   	{
>   		.procname	= "ip_local_reserved_ports",
>   		.data		= &init_net.ipv4.sysctl_local_reserved_ports,

