Message-ID: <76nzfqbnb7dfbzrezpaeudtdzub7l26v6fdubbif6quu3hyvcv@gfhmjdh64r2c>
Date: Fri, 3 Oct 2025 15:01:16 +0000
From: Jordan Rife <jrife@...gle.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
 Yusuke Suzuki <yusuke.suzuki@...valent.com>, Julian Wiedmann <jwi@...valent.com>,
 Martin KaFai Lau <martin.lau@...nel.org>, Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH bpf] bpf: Fix metadata_dst leak __bpf_redirect_neigh_v{4,6}

On Fri, Oct 03, 2025 at 09:34:18AM +0200, Daniel Borkmann wrote:
> Cilium has a BPF egress gateway feature which forces outgoing K8s Pod
> traffic to pass through dedicated egress gateways which then SNAT the
> traffic in order to interact with stable IPs outside the cluster.
>
> The traffic is directed to the gateway via vxlan tunnel in collect md
> mode. A recent BPF change utilized the bpf_redirect_neigh() helper to
> forward packets after arrival and decap on vxlan. Over time, it turned
> out that the kmalloc-256 slab usage in the kernel was ever-increasing.
>
> The issue was that vxlan allocates the metadata_dst object and attaches
> it through a fake dst entry to the skb. The latter was never released
> though given bpf_redirect_neigh() was merely setting the new dst entry
> via skb_dst_set() without dropping an existing one first.
>
> Fixes: b4ab31414970 ("bpf: Add redirect_neigh helper as redirect drop-in")
> Reported-by: Yusuke Suzuki <yusuke.suzuki@...valent.com>
> Reported-by: Julian Wiedmann <jwi@...valent.com>
> Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
> Cc: Martin KaFai Lau <martin.lau@...nel.org>
> Cc: Jakub Kicinski <kuba@...nel.org>
> Cc: Jordan Rife <jrife@...gle.com>
> ---
> net/core/filter.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index b005363f482c..c3c0b5a37504 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2281,6 +2281,7 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev,
>  		if (IS_ERR(dst))
>  			goto out_drop;
>  
> +		skb_dst_drop(skb);
>  		skb_dst_set(skb, dst);
>  	} else if (nh->nh_family != AF_INET6) {
>  		goto out_drop;
> @@ -2389,6 +2390,7 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev,
>  			goto out_drop;
>  		}
>  
> +		skb_dst_drop(skb);
>  		skb_dst_set(skb, &rt->dst);
>  	}
>
> --
> 2.43.0
>
Nice catch!
Reviewed-by: Jordan Rife <jrife@...gle.com>