Message-ID: <20241031215243.GA4460@breakpoint.cc>
Date: Thu, 31 Oct 2024 22:52:43 +0100
From: Florian Westphal <fw@...len.de>
To: Austin Hendrix <namniart@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: Duplicate invocation of NF_INET_POST_ROUTING rule for outbound multicast?
Austin Hendrix <namniart@...il.com> wrote:
> I've been staring at the linux source code for a while, and I think
> this part of ip_mc_output explains it.
>
> 	if (sk_mc_loop(sk)
> #ifdef CONFIG_IP_MROUTE
> 	/* Small optimization: do not loopback not local frames,
> 	   which returned after forwarding; they will be dropped
> 	   by ip_mr_input in any case.
> 	   Note, that local frames are looped back to be delivered
> 	   to local recipients.
>
> 	   This check is duplicated in ip_mr_input at the moment.
> 	 */
> 	    &&
> 	    ((rt->rt_flags & RTCF_LOCAL) ||
> 	     !(IPCB(skb)->flags & IPSKB_FORWARDED))
> #endif
> 	   ) {
> 		struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);
> 		if (newskb)
> 			NF_HOOK(NFPROTO_IPV4, NF_INET_POST_ROUTING,
> 				net, sk, newskb, NULL, newskb->dev,
> 				ip_mc_finish_output);
> 	}
>
> It looks like ip_mc_output duplicates outgoing multicast, sends the
> copy through POSTROUTING first (remember how the first copy didn't
> have UID and GID?), and then loops that copy back for local multicast
> listeners.
>
> I haven't followed all of the details yet, but it looks like the copy
> that is looped back lacks the sk_buff attributes which identify the
> UID, GID and cgroup of the sender.
Yes, skb_clone'd skbs are not owned by any socket.
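The clone is only made when sk_mc_loop(sk) is true, which for an IPv4 socket reflects its IP_MULTICAST_LOOP option. As a hedged userspace sketch (the helper names set_mc_loop and get_mc_loop are made up for illustration), a sender that does not need local delivery could switch the option off, which should avoid the looped-back copy and its extra POSTROUTING pass for that socket. The trade-off is that listeners on the same host no longer receive those packets.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enable (1) or disable (0) multicast loopback
 * on an IPv4 socket. Returns 0 on success, -1 on error. */
static int set_mc_loop(int fd, int enable)
{
	int loop = enable ? 1 : 0;

	return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP,
			  &loop, sizeof(loop));
}

/* Hypothetical helper: read the current setting back.
 * Returns 0 or 1, or -1 on error. */
static int get_mc_loop(int fd)
{
	int loop = 0;
	socklen_t len = sizeof(loop);

	if (getsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, &len) < 0)
		return -1;
	return loop ? 1 : 0;
}
```

Calling set_mc_loop(fd, 0) on the sending UDP socket before the first sendto() is the intended usage. This sidesteps the problem rather than fixing it; it does nothing for setups where local listeners must see the traffic.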
> Is my understanding of this correct? Is the netdev team willing to
> discuss possible solutions to this, or is this behavior "by design?"
It's for historic reasons: this code is very old and predates cgroups.
You could try this (untested) patch, ipv6 would need similar treatment.
We'd probably also want to extend this to RTCF_BROADCAST, i.e. add
skb_clone_sk() or similar helper and then use that for these clones.
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -396,10 +396,16 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 #endif
 		   ) {
 			struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);
-			if (newskb)
+			if (newskb) {
+				struct sock *skb_sk = skb->sk;
+
+				if (skb_sk)
+					skb_set_owner_edemux(newskb, skb_sk);
+
 				NF_HOOK(NFPROTO_IPV4, NF_INET_POST_ROUTING,
 					net, sk, newskb, NULL, newskb->dev,
 					ip_mc_finish_output);
+			}
 		}
 
 	/* Multicasts with ttl 0 must not go beyond the host */
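
To illustrate why the looped-back copy fails owner-based matches, here is a small userspace model. It is only a sketch: toy_sock, toy_skb and every function below are invented stand-ins, not kernel APIs, and the model ignores the socket reference counting and destructor handling that the kernel has to do when it attaches an owner to an skb.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for struct sock / struct sk_buff (illustrative only). */
struct toy_sock {
	unsigned int uid;
	unsigned int gid;
};

struct toy_skb {
	struct toy_sock *sk;	/* owning socket, NULL if unowned */
	char payload[32];
};

/* Models the behaviour described above: the clone carries the
 * packet data but is not owned by any socket. */
static struct toy_skb *toy_skb_clone(const struct toy_skb *skb)
{
	struct toy_skb *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	memcpy(n->payload, skb->payload, sizeof(n->payload));
	n->sk = NULL;
	return n;
}

/* Hypothetical helper in the spirit of the patch: clone, then carry
 * the original owner over to the clone if there is one. */
static struct toy_skb *toy_skb_clone_owned(const struct toy_skb *skb)
{
	struct toy_skb *n = toy_skb_clone(skb);

	if (n && skb->sk)
		n->sk = skb->sk;
	return n;
}

/* Models an owner match in a POSTROUTING rule: returns 1 only if
 * the skb has an owning socket with the given uid, else 0. */
static int toy_owner_match(const struct toy_skb *skb, unsigned int uid)
{
	return skb->sk != NULL && skb->sk->uid == uid;
}
```

In this model a plain clone never matches a uid rule, while a clone made with the owner-preserving helper matches the same way the original does, which is the effect the patch aims for on the looped-back multicast copy.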