Message-Id: <0A676E07-BD16-492A-8C10-4FC541525F73@gmail.com>
Date: Thu, 31 Oct 2024 18:59:09 -0700
From: namniart@...il.com
To: Florian Westphal <fw@...len.de>
Cc: netdev@...r.kernel.org
Subject: Re: Duplicate invocation of NF_INET_POST_ROUTING rule for outbound multicast?
Thanks for the quick reply; I will work through our build process and try to get that tested in the next few days.
I was thinking the fix for this might be more substantial: call NF_HOOK without a callback at the top of ip_mc_output to determine the fate of the packet, and then clone and loop back copies only if the packet passed the postrouting chain. That would prevent the nf chain from being called multiple times for the exact same packet, could apply to both multicast and RTCF_BROADCAST, and would solve the cgroup issue at the same time.
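To make the ordering concrete, here is a rough userspace sketch of the two flows. It is only a toy model, not kernel code: toy_skb, toy_postrouting, toy_clone, current_flow and proposed_flow are hypothetical names made up for illustration, and the toy chain accepts every packet. It assumes, as in the quoted discussion, that a cloned skb has no owning socket.

```c
#include <stdbool.h>

/* Toy stand-in for struct sk_buff: tracks only whether a socket owns it. */
struct toy_skb {
    bool has_owner; /* false models a clone with no owning socket */
};

static int hook_calls;     /* times the toy POSTROUTING chain ran */
static int ownerless_seen; /* runs where the chain saw no owner */

/* Toy POSTROUTING chain: records what it saw and accepts the packet. */
static bool toy_postrouting(const struct toy_skb *skb)
{
    hook_calls++;
    if (!skb->has_owner)
        ownerless_seen++;
    return true;
}

/* Toy clone: the copy loses its owner, as described for skb_clone(). */
static struct toy_skb toy_clone(const struct toy_skb *skb)
{
    struct toy_skb copy = *skb;

    copy.has_owner = false;
    return copy;
}

/* Current ordering: clone first, then run the chain on each copy. */
static int current_flow(const struct toy_skb *skb)
{
    struct toy_skb copy = toy_clone(skb);
    int delivered = 0;

    if (toy_postrouting(&copy)) /* looped-back copy, no owner */
        delivered++;
    if (toy_postrouting(skb))   /* copy headed for the wire */
        delivered++;
    return delivered;
}

/* Proposed ordering: one verdict on the owned skb, clone only on accept. */
static int proposed_flow(const struct toy_skb *skb)
{
    struct toy_skb copy;

    if (!toy_postrouting(skb))
        return 0;               /* dropped: nothing is cloned */
    copy = toy_clone(skb);      /* loopback copy skips the chain */
    (void)copy;
    return 2;                   /* loopback copy plus wire copy */
}
```

In this model current_flow runs the chain twice, once without an owner, while proposed_flow runs it once and always with the owner attached. Whether the real hook can be split this way in ip_mc_output is exactly the part I would need to verify.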
If you think that’s a useful approach, I am willing to write and test the patch.
-Austin
> On Oct 31, 2024, at 2:52 PM, Florian Westphal <fw@...len.de> wrote:
>
> Austin Hendrix <namniart@...il.com> wrote:
>> I've been staring at the linux source code for a while, and I think
>> this part of ip_mc_output explains it.
>>
>> if (sk_mc_loop(sk)
>> #ifdef CONFIG_IP_MROUTE
>>     /* Small optimization: do not loopback not local frames,
>>        which returned after forwarding; they will be dropped
>>        by ip_mr_input in any case.
>>        Note, that local frames are looped back to be delivered
>>        to local recipients.
>>
>>        This check is duplicated in ip_mr_input at the moment.
>>      */
>>     &&
>>     ((rt->rt_flags & RTCF_LOCAL) ||
>>      !(IPCB(skb)->flags & IPSKB_FORWARDED))
>> #endif
>>    ) {
>>         struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);
>>         if (newskb)
>>                 NF_HOOK(NFPROTO_IPV4, NF_INET_POST_ROUTING,
>>                         net, sk, newskb, NULL, newskb->dev,
>>                         ip_mc_finish_output);
>> }
>>
>> It looks like ip_mc_output duplicates outgoing multicast, sends the
>> copy through POSTROUTING first (remember how the first copy didn't
>> have UID and GID?), and then loops that copy back for local multicast
>> listeners.
>>
>> I haven't followed all of the details yet, but it looks like the copy
>> that is looped back lacks the sk_buff attributes which identify the
>> UID, GID and cgroup of the sender.
>
> Yes, skb_clone'd skbs are not owned by any socket.
>
>> Is my understanding of this correct? Is the netdev team willing to
>> discuss possible solutions to this, or is this behavior "by design?"
>
> It's for historic reasons, this is very old and predates cgroups.
>
> You could try this (untested) patch, ipv6 would need similar treatment.
> We'd probably also want to extend this to RTCF_BROADCAST, i.e. add
> skb_clone_sk() or similar helper and then use that for these clones.
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -396,10 +396,16 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>  #endif
>     ) {
>          struct sk_buff *newskb = skb_clone(skb, GFP_ATOMIC);
> -        if (newskb)
> +        if (newskb) {
> +                struct sock *skb_sk = skb->sk;
> +
> +                if (skb_sk)
> +                        skb_set_owner_edemux(newskb, skb_sk);
> +
>                  NF_HOOK(NFPROTO_IPV4, NF_INET_POST_ROUTING,
>                          net, sk, newskb, NULL, newskb->dev,
>                          ip_mc_finish_output);
> +        }
>  }
>
>  /* Multicasts with ttl 0 must not go beyond the host */