Message-ID: <76nzfqbnb7dfbzrezpaeudtdzub7l26v6fdubbif6quu3hyvcv@gfhmjdh64r2c>
Date: Fri, 3 Oct 2025 15:01:16 +0000
From: Jordan Rife <jrife@...gle.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org, 
	Yusuke Suzuki <yusuke.suzuki@...valent.com>, Julian Wiedmann <jwi@...valent.com>, 
	Martin KaFai Lau <martin.lau@...nel.org>, Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH bpf] bpf: Fix metadata_dst leak
 __bpf_redirect_neigh_v{4,6}

On Fri, Oct 03, 2025 at 09:34:18AM +0200, Daniel Borkmann wrote:
> Cilium has a BPF egress gateway feature which forces outgoing K8s Pod
> traffic to pass through dedicated egress gateways which then SNAT the
> traffic in order to interact with stable IPs outside the cluster.
> 
> The traffic is directed to the gateway via vxlan tunnel in collect md
> mode. A recent BPF change utilized the bpf_redirect_neigh() helper to
> forward packets after arrival and decap on vxlan. Over time, it turned
> out that kmalloc-256 slab usage in the kernel was ever-increasing.
> 
> The issue was that vxlan allocates the metadata_dst object and attaches
> it through a fake dst entry to the skb. The latter was never released
> though given bpf_redirect_neigh() was merely setting the new dst entry
> via skb_dst_set() without dropping an existing one first.
> 
> Fixes: b4ab31414970 ("bpf: Add redirect_neigh helper as redirect drop-in")
> Reported-by: Yusuke Suzuki <yusuke.suzuki@...valent.com>
> Reported-by: Julian Wiedmann <jwi@...valent.com>
> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
> Cc: Martin KaFai Lau <martin.lau@...nel.org>
> Cc: Jakub Kicinski <kuba@...nel.org>
> Cc: Jordan Rife <jrife@...gle.com>
> ---
>  net/core/filter.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index b005363f482c..c3c0b5a37504 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2281,6 +2281,7 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev,
>  		if (IS_ERR(dst))
>  			goto out_drop;
>  
> +		skb_dst_drop(skb);
>  		skb_dst_set(skb, dst);
>  	} else if (nh->nh_family != AF_INET6) {
>  		goto out_drop;
> @@ -2389,6 +2390,7 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
>  			goto out_drop;
>  		}
>  
> +		skb_dst_drop(skb);
>  		skb_dst_set(skb, &rt->dst);
>  	}
>  
> -- 
> 2.43.0
>

Nice catch!

Reviewed-by: Jordan Rife <jrife@...gle.com>
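
For readers following along, the leak pattern described in the commit
message can be sketched in userspace. This is only a toy model under
stated assumptions: every name below (toy_dst, toy_skb, toy_redirect and
so on) is hypothetical and merely mimics the set-without-drop behaviour;
it is not the kernel's implementation of skb_dst_set() or skb_dst_drop().

```c
#include <assert.h>
#include <stdlib.h>

/* Toy userspace model; all names are hypothetical, not kernel APIs. */
struct toy_dst {
	int refcnt;
};

struct toy_skb {
	struct toy_dst *dst;
};

/* Number of toy_dst objects allocated and not yet freed. */
static int live_objects;

static struct toy_dst *toy_dst_alloc(void)
{
	struct toy_dst *d = malloc(sizeof(*d));

	if (!d)
		abort();
	d->refcnt = 1;		/* the caller owns the initial reference */
	live_objects++;
	return d;
}

static void toy_dst_release(struct toy_dst *d)
{
	assert(d->refcnt > 0);
	if (--d->refcnt == 0) {
		free(d);
		live_objects--;
	}
}

/* Takes over the caller's reference; does NOT release a previous dst. */
static void toy_skb_dst_set(struct toy_skb *skb, struct toy_dst *d)
{
	skb->dst = d;
}

/* Releases the currently attached dst, if any. */
static void toy_skb_dst_drop(struct toy_skb *skb)
{
	if (skb->dst) {
		toy_dst_release(skb->dst);
		skb->dst = NULL;
	}
}

/*
 * Models one packet: a tunnel attaches metadata, a redirect replaces it
 * with a route, then the packet is freed. Returns how many objects are
 * still allocated afterwards, i.e. how many were leaked.
 */
static int toy_redirect(int drop_first)
{
	struct toy_skb skb = { .dst = NULL };
	int before = live_objects;
	struct toy_dst *route;

	toy_skb_dst_set(&skb, toy_dst_alloc());	/* tunnel metadata */
	route = toy_dst_alloc();		/* route lookup result */

	if (drop_first)
		toy_skb_dst_drop(&skb);		/* the step the patch adds */
	toy_skb_dst_set(&skb, route);

	toy_skb_dst_drop(&skb);			/* packet freed */
	return live_objects - before;
}
```

In this model, toy_redirect(0) leaves one object behind per packet, while
toy_redirect(1) leaves none, which matches the ever-growing slab usage
described in the report and the effect of dropping the old dst first.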
