Message-ID: <aLBE2Ee7pUBzUupH@calendula>
Date: Thu, 28 Aug 2025 14:00:24 +0200
From: Pablo Neira Ayuso <pablo@...filter.org>
To: Fabian Bläse <fabian@...ese.de>
Cc: netdev@...r.kernel.org, netfilter-devel@...r.kernel.org,
	"Jason A. Donenfeld" <Jason@...c4.com>,
	Florian Westphal <fw@...len.de>
Subject: Re: [PATCH v3] icmp: fix icmp_ndo_send address translation for reply
 direction

On Thu, Aug 28, 2025 at 11:14:35AM +0200, Fabian Bläse wrote:
> The icmp_ndo_send function was originally introduced to ensure proper
> rate limiting when icmp_send is called by a network device driver,
> where the packet's source address may have already been transformed
> by SNAT.
> 
> However, the original implementation only considers the
> IP_CT_DIR_ORIGINAL direction for SNAT and always replaced the packet's
> source address with that of the original-direction tuple. This causes
> two problems:
> 
> 1. For SNAT:
>    Reply-direction packets were incorrectly translated using the source
>    address of the CT original direction, even though no translation is
>    required.
> 
> 2. For DNAT:
>    Reply-direction packets were not handled at all. In DNAT, the original
>    direction's destination is translated. Therefore, in the reply
>    direction the source address must be set to the reply-direction
>    source, so rate limiting works as intended.
> 
> Fix this by using the connection direction to select the correct tuple
> for source address translation, and adjust the pre-checks to handle
> reply-direction packets in case of DNAT.
> 
> Additionally, wrap the `ct->status` access in READ_ONCE(). This avoids
> possible KCSAN reports about concurrent updates to `ct->status`.

I think such a concurrent update cannot happen: the NAT bits are only
set for the first packet of a connection, which sets up the NAT
configuration, so READ_ONCE() can go away.

Florian?

> Fixes: 0b41713b6066 ("icmp: introduce helper for nat'd source address in network device context")
> 
> Signed-off-by: Fabian Bläse <fabian@...ese.de>
> Cc: Jason A. Donenfeld <Jason@...c4.com>
> Cc: Florian Westphal <fw@...len.de>
> ---
> Changes v1->v2:
> - Implement fix for ICMPv6 as well
> 
> Changes v2->v3:
> - Collapse conditional tuple selection into a single direction lookup [Florian]
> - Always apply source address translation if IPS_NAT_MASK is set [Florian]
> - Wrap ct->status in READ_ONCE()
> - Add a clearer explanation of the behaviour change for DNAT
> ---
>  net/ipv4/icmp.c     | 6 ++++--
>  net/ipv6/ip6_icmp.c | 6 ++++--
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index 2ffe73ea644f..c48c572f024d 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -799,11 +799,12 @@ void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info)
>  	struct sk_buff *cloned_skb = NULL;
>  	struct ip_options opts = { 0 };
>  	enum ip_conntrack_info ctinfo;
> +	enum ip_conntrack_dir dir;
>  	struct nf_conn *ct;
>  	__be32 orig_ip;
>  
>  	ct = nf_ct_get(skb_in, &ctinfo);
> -	if (!ct || !(ct->status & IPS_SRC_NAT)) {
> +	if (!ct || !(READ_ONCE(ct->status) & IPS_NAT_MASK)) {
>  		__icmp_send(skb_in, type, code, info, &opts);
>  		return;
>  	}
> @@ -818,7 +819,8 @@ void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info)
>  		goto out;
>  
>  	orig_ip = ip_hdr(skb_in)->saddr;
> -	ip_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.ip;
> +	dir = CTINFO2DIR(ctinfo);
> +	ip_hdr(skb_in)->saddr = ct->tuplehash[dir].tuple.src.u3.ip;
>  	__icmp_send(skb_in, type, code, info, &opts);
>  	ip_hdr(skb_in)->saddr = orig_ip;
>  out:
> diff --git a/net/ipv6/ip6_icmp.c b/net/ipv6/ip6_icmp.c
> index 9e3574880cb0..233914b63bdb 100644
> --- a/net/ipv6/ip6_icmp.c
> +++ b/net/ipv6/ip6_icmp.c
> @@ -54,11 +54,12 @@ void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
>  	struct inet6_skb_parm parm = { 0 };
>  	struct sk_buff *cloned_skb = NULL;
>  	enum ip_conntrack_info ctinfo;
> +	enum ip_conntrack_dir dir;
>  	struct in6_addr orig_ip;
>  	struct nf_conn *ct;
>  
>  	ct = nf_ct_get(skb_in, &ctinfo);
> -	if (!ct || !(ct->status & IPS_SRC_NAT)) {
> +	if (!ct || !(READ_ONCE(ct->status) & IPS_NAT_MASK)) {
>  		__icmpv6_send(skb_in, type, code, info, &parm);
>  		return;
>  	}
> @@ -73,7 +74,8 @@ void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
>  		goto out;
>  
>  	orig_ip = ipv6_hdr(skb_in)->saddr;
> -	ipv6_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.in6;
> +	dir = CTINFO2DIR(ctinfo);
> +	ipv6_hdr(skb_in)->saddr = ct->tuplehash[dir].tuple.src.u3.in6;
>  	__icmpv6_send(skb_in, type, code, info, &parm);
>  	ipv6_hdr(skb_in)->saddr = orig_ip;
>  out:
> -- 
> 2.51.0
> 
> 
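
For readers following the thread, here is a minimal userspace sketch of the direction-based tuple selection described in the commit message above. It is not kernel code: `struct conn_model`, `struct tuple_model`, `enum dir_model`, `IP4()`, `saddr_old()` and `saddr_new()` are hypothetical stand-ins for the conntrack entry, its two tuples, and the `tuplehash[0]` versus `tuplehash[dir]` lookups in the diff. The addresses are made-up documentation examples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the conntrack direction and tuple types. */
enum dir_model { DIR_ORIGINAL = 0, DIR_REPLY = 1 };

struct tuple_model {
	uint32_t src;
	uint32_t dst;
};

struct conn_model {
	struct tuple_model tuple[2];	/* indexed by enum dir_model */
};

/* Pack a dotted quad into a host-order integer, for readability only. */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Pre-patch behaviour: always take the original-direction source. */
static uint32_t saddr_old(const struct conn_model *ct, enum dir_model dir)
{
	(void)dir;	/* direction is ignored, which is the bug being fixed */
	return ct->tuple[DIR_ORIGINAL].src;
}

/* Patched behaviour: take the source of the tuple for the packet's direction. */
static uint32_t saddr_new(const struct conn_model *ct, enum dir_model dir)
{
	assert(dir == DIR_ORIGINAL || dir == DIR_REPLY);
	return ct->tuple[dir].src;
}
```

Assuming a DNAT entry whose original tuple is 10.0.0.1 -> 192.0.2.1 and whose reply tuple is 10.0.1.2 -> 10.0.0.1, `saddr_old()` returns 10.0.0.1 for a reply-direction packet, while `saddr_new()` returns 10.0.1.2, the pre-translation source that rate limiting should see. For original-direction packets both functions return the same address.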
