Message-ID: <3fc5b9be1d73417a99756404c0089814@skoda.cz>
Date: Fri, 8 Oct 2021 20:08:36 +0000
From: Strejc Cyril <cyril.strejc@...da.cz>
To: "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: PROBLEM: multicast router does not fill UDP csum of its own forwarded
packets
Hi,
please let me summarize a problem with Linux multicast routing that occurs when L4 checksum offloading is combined with forwarding of the router's own (locally produced) multicast packets.
* Application observation *
A multicast router does not fill in the UDP checksum of locally produced, looped-back and forwarded UDP datagrams if the NIC the datagrams are originally sent to has UDP TX checksum offload enabled.
* Full description / User story *
I run an application which uses the Linux multicast routing capabilities to send identical multicast UDP datagrams to multiple networks. The application sets IP_MULTICAST_IF and sends each datagram with a single write to a single socket. Properly configured Linux multicast routing, combined with multicast loop-back, ensures the datagrams are also forwarded to the other network interfaces.
If the outgoing IP_MULTICAST_IF interface has UDP TX checksum offload enabled, the kernel neither calculates the checksum nor fills it into the skb data. The NIC with TX checksum offload calculates and fills in the checksum during transmission, but does not modify the skb data in RAM (at least, neither of the two NICs I have tested does).
The packet is then looped back in ip_mc_finish_output() and dev_loopback_xmit(), where skb->ip_summed is set to CHECKSUM_UNNECESSARY. From then on, the packet traverses the network stack with a wrong (not filled in) L4 checksum. It is forwarded to the multicast routing output interfaces with CHECKSUM_UNNECESSARY, and hence the checksum is never corrected.
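For illustration, the sending side of such an application might look roughly like the Python sketch below. It is not the actual application; the helper names sender_options() and send_once() are hypothetical, and the addresses, port and TTL are the ones used in the reproduction steps further down. IP_MULTICAST_LOOP is enabled by default on Linux and is set here only to make the dependency explicit.

```python
import socket

GROUP, PORT = "239.192.0.1", 9   # group and port from the reproduction steps
IF_ADDR = "192.168.1.1"          # local address of the IP_MULTICAST_IF interface


def sender_options(if_addr: str, ttl: int = 2):
    """Socket options for a multicast sender whose packets can be routed on.

    Hypothetical helper: returns (level, option, value) tuples.
    """
    return [
        # Select the outgoing interface by its local IPv4 address.
        (socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(if_addr)),
        # TTL must be > 1, otherwise the datagram cannot be forwarded.
        (socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl),
        # Loop-back hands a copy to the local stack, where multicast
        # routing picks it up.
        (socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1),
    ]


def send_once(payload: bytes, if_addr: str = IF_ADDR,
              group: str = GROUP, port: int = PORT) -> int:
    """Send one datagram with a single write to a single socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for level, option, value in sender_options(if_addr):
            sock.setsockopt(level, option, value)
        return sock.sendto(payload, (group, port))
```

Calling send_once(b"check\n") on a host configured as in the reproduction steps should emit one datagram comparable to the socat command there.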
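The sequence described above can be summarized in a toy model. This is a deliberately simplified Python sketch and not kernel code: the Skb class and the three functions are hypothetical stand-ins for the real paths, and only the ip_summed state names are taken from the kernel.

```python
from dataclasses import dataclass
from enum import Enum


class IpSummed(Enum):
    CHECKSUM_NONE = 0
    CHECKSUM_UNNECESSARY = 1
    CHECKSUM_COMPLETE = 2
    CHECKSUM_PARTIAL = 3


@dataclass
class Skb:
    ip_summed: IpSummed
    csum_in_data: bool  # True once the UDP checksum is present in the packet bytes


def local_udp_output(tx_offload: bool) -> Skb:
    """Locally produced datagram heading for the IP_MULTICAST_IF interface."""
    if tx_offload:
        # Checksum is left for the NIC; the data in RAM does not contain it.
        return Skb(IpSummed.CHECKSUM_PARTIAL, csum_in_data=False)
    return Skb(IpSummed.CHECKSUM_NONE, csum_in_data=True)


def loopback_copy(skb: Skb) -> Skb:
    """Models the effect reported for dev_loopback_xmit(): the looped-back
    copy is marked CHECKSUM_UNNECESSARY regardless of what its data holds."""
    return Skb(IpSummed.CHECKSUM_UNNECESSARY, skb.csum_in_data)


def forward_xmit(skb: Skb) -> bool:
    """Return True if the forwarded packet carries a valid checksum on the wire."""
    if skb.ip_summed is IpSummed.CHECKSUM_PARTIAL:
        return True  # the device (or a software fallback) would complete it
    return skb.csum_in_data  # nothing else fills in the checksum field


# With offload on the original interface, the forwarded copy goes out broken.
print(forward_xmit(loopback_copy(local_udp_output(tx_offload=True))))
```

In this model the forwarded copy is only correct when the original interface has TX checksum offload disabled, which matches the behaviour I observe.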
* Kernel info *
Tested: 5.4, 5.15-rc3
I do not know when the problem was introduced. It was probably a long time ago, maybe in 35fc92a9 ("[NET]: Allow forwarding of ip_summed except CHECKSUM_COMPLETE").
* NIC tested *
I've tested two NIC drivers with TX checksum offloading: e1000e from the vanilla kernel and NXP's DPAA with open-source drivers from outside the vanilla tree.
* Steps to reproduce *
The problem can be reproduced with a shell, ip, smcroute and socat. Tested on Linux Mint with the smcrouted shell wrapper.
# UCO_IF=eth0 # Interface with UDP TX Checksum Offload enabled.
# DST_IF=eth1
# ip addr add 192.168.1.1/24 dev $UCO_IF
# ip addr add 192.168.2.1/24 dev $DST_IF
# smcroute -d
# smcroute -a $UCO_IF 192.168.1.1 239.192.0.1 $DST_IF
# echo "check" | socat - UDP:239.192.0.1:9,ip-multicast-ttl=2,ip-multicast-if=192.168.1.1
The "check" datagram is sent out of DST_IF with a wrong UDP checksum. Another computer or a physical wire loopback is needed to capture the packet as it left DST_IF.
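To check a captured datagram independently of the capture tool, the UDP checksum can be recomputed from the IPv4 pseudo-header and the UDP segment as defined in RFC 768. The following Python sketch assumes the UDP header plus payload has already been extracted from the capture as raw bytes; the function names are hypothetical. Note that a stored checksum of 0 (meaning "no checksum" for UDP over IPv4) is reported as a mismatch here.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum over data (Internet checksum arithmetic)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """Checksum that belongs in the UDP header of an IPv4 datagram."""
    # Zero the checksum field (bytes 6..7 of the UDP header) before summing.
    segment = udp_segment[:6] + b"\x00\x00" + udp_segment[8:]
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_UDP, len(segment)))
    csum = ~ones_complement_sum(pseudo + segment) & 0xFFFF
    return csum or 0xFFFF  # a computed 0 is transmitted as 0xFFFF


def udp_checksum_ok(src_ip: str, dst_ip: str, udp_segment: bytes) -> bool:
    """Compare the stored checksum with a freshly computed one."""
    stored = struct.unpack("!H", udp_segment[6:8])[0]
    return stored == udp_checksum(src_ip, dst_ip, udp_segment)
```

For the reproduction above, src_ip would be 192.168.1.1 and dst_ip 239.192.0.1; the source port is whatever socat picked and is read from the capture.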
* Workaround *
I use the attached patch as a workaround, but I am not at all sure that it is correct in all cases.
I would be very pleased if anyone could think about a correct approach to the problem.
Thanks,
Cyril
View attachment "0001-net-multicast-calc-csum-of-looped-back-and-forwarded.patch" of type "text/x-patch" (1421 bytes)