[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4411734.UPlyArG6xL@benoit.monin>
Date: Thu, 10 Oct 2024 18:02:39 +0200
From: Benoît Monin <benoit.monin@....fr>
To: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Jiri Pirko <jiri@...nulli.us>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>, netdev@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
Subject:
Re: [PATCH net-next] net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header
contains extension
07/10/2024 Willem de Bruijn wrote:
> Benoît Monin wrote:
> > 07/10/2024 Willem de Bruijn wrote :
> > > Benoît Monin wrote:
> > > > Devices with NETIF_F_IP_CSUM capability can checksum TCP and UDP over
> > > > IPv4 with an IP header that may contains options; whereas devices with
> > > > NETIF_F_IPV6_CSUM capability can only checksum TCP and UDP over IPv6 if
> > > > the IP header does not contains extension.
> > >
> > > Are both these statements universally true across devices?
> > >
> > > I can believe for NETIF_F_IP_CSUM that this is the definition, and
> > > that devices that cannot handle options must fix it up indivually in
> > > ndo_features_check.
> > >
> > > And same for NETIF_F_IPV6_CSUM with extension headers.
> > >
> > > But it would be good to see where this is asserted in the code, or
> > > examples of drivers that have to perform such actions.
> > >
> > I was referring to the documentation in skbuff.h that describes
> > NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM.
>
> Excellent. I had missed that when trying to find the source.
> Can you add that pointer to the commit message?
>
Agreed, I will add that to the commit message.
> > For NETIF_F_IPV6_CSUM, at least fsl_dpa and r8169 expect
> > ipv6_hdr(skb)->nexthdr to be IPPROTO_{TCP,UDP} to compute the correct
> > checksum for IPv6.
> >
> > I posted more details about the problem I am trying to fix with this
> > patch in the following thread:
> > https://lore.kernel.org/netdev/26548921.1r3eYUQgxm@benoit.monin/T/#u
>
> And I had missed this thread entirely too. It's very helpful.
>
> Please add a Link: tag that refers to it.
>
Ok.
> > > > Enforce that in skb_csum_hwoffload_help by checking the network header
> > > > length in the case where the IP header version is 6. We cannot simply
> > > > rely on the network header length since the IPv4 header can from 20 to
> > > > 60 bytes whereas the IPv6 header must be 40 bytes. So we check the
> > > > version field which is common to IPv4 and IPv6 headers.
> > > >
> > > > This fixes checksumming errors seen with ip6_tunnel and fou6
> > > > encapsulation, for example with GRE-in-UDP over IPv6:
> > > > * fou6 adds a UDP header with a partial checksum if the inner packet
> > > > does not contains a valid checksum.
> > >
> > > Where in the code is this conditional on the inner packet csum?
> > >
> > This is done by udp6_set_csum, which called by fou6_build_udp.
>
> else if (skb->ip_summed == CHECKSUM_PARTIAL) {
> uh->check = 0;
> uh->check = udp_v6_check(len, saddr, daddr, lco_csum(skb));
> if (uh->check == 0)
> uh->check = CSUM_MANGLED_0;
> } else {
> skb->ip_summed = CHECKSUM_PARTIAL;
> skb->csum_start = skb_transport_header(skb) - skb->head;
> skb->csum_offset = offsetof(struct udphdr, check);
> uh->check = ~udp_v6_check(len, saddr, daddr, 0);
> }
>
> It either leaves the inner header as CHECKSUM_PARTIAL, and fills in
> the outer header entirely, based on knowledge that the inner header
> will add up to zero (local checksum offload).
>
> Or it assumes that the inner header is already filled in (whether
> computed or CSUM_MANGLED_0) and then uses CHECKSUM_PARTIAL offloading
> for the outer packet.
>
> The issue you are seeing is because CHECKSUM_PARTIAL with
> NETIF_F_IPV6_CSUM does not work if extension headers are present. If
> so, it should even affect non-tunneled packets. I think this reference
> to a dependency on the state of an inner checksum adds confusion only.
> Unless I'm missing something.
>
I also forgot to mention that the problem only affects encapsulated packets.
UDP packets emitted from user-space to an IPv6 address go through
ip6_make_skb(), which calls __ip6_append_data() where a check is done on the
header size before setting CHECKSUM_PARTIAL:
/* CHECKSUM_PARTIAL only with no extension headers and when
* we are not going to fragment
*/
if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
headersize == sizeof(struct ipv6hdr) &&
length <= mtu - headersize &&
(!(flags & MSG_MORE) || cork->gso_size) &&
rt->dst.dev->features & (NETIF_F_IPV6_CSUM | NETIF_F_HW_CSUM))
csummode = CHECKSUM_PARTIAL;
This is done before calling ip6_local_out(). No such check is done by
ip6_tnl_xmit() before it calls ip6_local_out(), because there is no reason to
check for device offload feature at this point. It is assumed that it will be
verified by validate_xmit_skb().
I will add this explanation to the commit message.
> > > > * ip6_tunnel adds an IPv6 header with a destination option extension
> > > > header if encap_limit is non-zero (the default value is 4).
> > >
> > >
> > > If this is a fix, we'll need to target net and best effort find a
> > > suitable fixes tag.
> > >
> > I guess the particular problem I have found is present since the merge
> > of fou6 in 4.7, but it might not be the only code path to create an
> > IPv6 packet with an extension header and a partial checksum.
>
> True. The fix as is won't be backportable before commit 62fafcd63139
> ("net: support ip generic csum processing in skb_csum_hwoffload_help")
> in v5.12 anyway.
>
> Maybe use that as the Fixes tag, but add a comment in the tag that it
> did not introduce the bug, but the fix depends on that logic.
>
Ok, I will add a Fixes tag.
> > > > Signed-off-by: Benoît Monin <benoit.monin@....fr>
> > > > ---
> > > > net/core/dev.c | 4 ++++
> > > > 1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > index ea5fbcd133ae..199831d86ec1 100644
> > > > --- a/net/core/dev.c
> > > > +++ b/net/core/dev.c
> > > > @@ -3639,6 +3639,9 @@ int skb_csum_hwoffload_help(struct sk_buff *skb,
> > > > return 0;
> > > >
> > > > if (features & (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM)) {
> > > > + if (ip_hdr(skb)->version == 6 &&
> > > > + skb_network_header_len(skb) != sizeof(struct ipv6hdr))
> > > > + goto sw_checksum;
>
> This check depends on skb->transport_header and skb->network_header
> being set. This is likely true for all CHECKSUM_PARTIAL packets that
> originate in the local stack. As well as for the injected packets and
> forwarded packets, as far as I see, so Ack.
>
> Access to the network header at this point likely requires
> skb_header_pointer, however. As also used in qdisc_pkt_len_init called
> from the same __dev_queue_xmit_nit.
>
> Perhaps this test should be in can_checksum_protocol, which already
> checks that the packet is IPv6 when testing NETIF_F_IPV6_CSUM.
>
You're right, moving this to can_checksum_protocol() makes more sense. I will
do that, retest and post a new version of the patch.
> > > > switch (skb->csum_offset) {
> > > > case offsetof(struct tcphdr, check):
> > > > case offsetof(struct udphdr, check):
> > > > @@ -3646,6 +3649,7 @@ int skb_csum_hwoffload_help(struct sk_buff *skb,
> > > > }
> > > > }
> > > >
> > > > +sw_checksum:
> > > > return skb_checksum_help(skb);
> > > > }
> > > > EXPORT_SYMBOL(skb_csum_hwoffload_help);
> > >
> >
>
Thanks for your thorough review!
--
Benoît
Powered by blists - more mailing lists