netdev - Re: [RFC] l2tp: avoid checksum offload for fragmented packets

Message-ID: <20130529171531.GB3179@raven>
Date:	Wed, 29 May 2013 18:15:32 +0100
From:	Tom Parkin <tparkin@...alix.com>
To:	Benjamin LaHaise <bcrl@...ck.org>
Cc:	netdev@...r.kernel.org, jchapman@...alix.com
Subject: Re: [RFC] l2tp: avoid checksum offload for fragmented packets

Thanks, Ben.

On Mon, May 27, 2013 at 02:58:29PM -0400, Benjamin LaHaise wrote:
> > This change modifies the L2TP xmit path to fall back to software checksum
> > calculation if the L2TP packet + IP header exceeds the tunnel device MTU.
> > Since we don't know what the IP header length will be a priori, we assume the
> > worst case of 60 bytes.  This will likely result in unnecessary software
> > checksumming when packet sizes approach the MTU since it's probably not common
> > to be using the full IP header.
> 
> Using the worst case value of 60 is a poor choice for many users of L2TP --
> plenty of the wholesale ISP services in the world using PPPoE transport 
> sessions to ISPs using frames with headers of ethernet(14) + IP(20) + UDP(8) +
> L2TP(6) = 48 (this setup is used by a number of large telcos here in Canada).
> This will result in spurious use of software checksumming over links that
> are provisioned with the minimum usable MTU (which is common with this kind
> of link).  Please make the code calculate the correct size of the added
> headers to avoid unexpected CPU overhead.

Yes, I agree.  I wasn't sure whether direct calculation of the IP
header length would be acceptable, or whether there was another
mechanism available that I should be making use of.

I'll respin this patch with direct calculations rather than the worst-case
guess.
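
To make the difference concrete, here is a minimal user-space sketch of
the size check with the header lengths passed in explicitly instead of
assuming a 60-byte IP header.  It is not the respun patch: the struct and
function names are hypothetical, and the sketch assumes the link-layer
header is not counted against the MTU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type, not a kernel structure: header bytes placed in
 * front of the L2TP payload before the packet is handed to the device. */
struct encap_overhead {
	size_t ip_hdr_len;   /* 20 without IP options, at most 60 with them */
	size_t udp_hdr_len;  /* 8 for UDP encapsulation, 0 for L2TP over IP */
	size_t l2tp_hdr_len; /* 6 in the setup Ben describes */
};

/* Sum of the headers that count against the IP MTU. */
static size_t encap_total(const struct encap_overhead *o)
{
	return o->ip_hdr_len + o->udp_hdr_len + o->l2tp_hdr_len;
}

/* Returns true if payload plus headers exceed the MTU, i.e. the packet
 * would be fragmented and the checksum should be computed in software. */
static bool l2tp_needs_sw_csum(size_t payload_len,
			       const struct encap_overhead *o, size_t mtu)
{
	/* An IPv4 header is always between 20 and 60 bytes long. */
	assert(o->ip_hdr_len >= 20 && o->ip_hdr_len <= 60);

	return payload_len + encap_total(o) > mtu;
}
```

With a 1500-byte MTU and a 1462-byte payload, the 20 + 8 + 6 case stays
at 1496 bytes and keeps checksum offload, while the 60-byte assumption
gives 1536 bytes and forces software checksumming for a packet that
would not have been fragmented.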

> > An alternative approach is to mimic UDP and use socket corking to allow us to
> > pass the skb to the IP layer prior to finally pushing the button on xmit.
> > This lets IP do its fragmentation before we authorise the packet send,
> > allowing us to check whether the packet was actually fragmented by IP or not.
> 
> That is probably undesirable from a CPU usage point of view.  Ideally, the 
> kernel's L2TP stack should generate ICMP frag needed messages for such 
> frames to avoid the fragmentation overhead (ipip is one such tunnelling 
> protocol that does this; there are others).

I agree.  That sounds like a better overall approach.  Perhaps we
could look at fixing up the immediate issue with a patch similar to
this (with your review comments resolved), and then add support for
ICMP frag needed messages as a further piece of work?

> > @@ -1197,30 +1224,14 @@ int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int hdr_len
> >  		uh->check = 0;
> >  
> >  		/* Calculate UDP checksum if configured to do so */
> > +		if (sk->sk_no_check == UDP_CSUM_NOXMIT)
> > +			skb->ip_summed = CHECKSUM_NONE;
> >  #if IS_ENABLED(CONFIG_IPV6)
> > -		if (sk->sk_family == PF_INET6)
> > +		else if (sk->sk_family == PF_INET6)
> >  			l2tp_xmit_ipv6_csum(sk, skb, udp_len);
> > -		else
> ...
> 
> The last time I checked, for IPv6 UDP packets, the checksum MUST always be 
> calculated (RFC 2460).  If this has changed, you'll also need to update the 
> IPv6 UDP receive path to allow rx packets with a zero checksum, as I believe 
> they are noisily dropped at present.

Good catch, I'll fix that by reverting this chunk.
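
For reference, the rule Ben cites can be sketched in user space as
follows.  This is only an illustration of the RFC 2460 requirement, not
the kernel code: both helper names are hypothetical, and the
csum_disabled flag merely stands in for a "no checksum" socket setting
such as the UDP_CSUM_NOXMIT test in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h> /* AF_INET, AF_INET6 */

/* Hypothetical helper: may the sender leave the UDP checksum field at
 * zero?  Over IPv4 a zero checksum means "not computed"; RFC 2460
 * makes the UDP checksum mandatory over IPv6. */
static bool udp_may_skip_csum(int family, bool csum_disabled)
{
	assert(family == AF_INET || family == AF_INET6);

	if (family == AF_INET6)
		return false;

	return csum_disabled;
}

/* Hypothetical helper: RFC 2460 requires a computed checksum of zero
 * to be transmitted as 0xFFFF, so zero never appears on the wire. */
static uint16_t udp6_csum_on_wire(uint16_t computed)
{
	return computed == 0 ? 0xFFFF : computed;
}
```

Under this rule the "no checksum" setting only ever takes effect for
IPv4 sockets, which is what reverting the chunk restores.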

-- 
Tom Parkin
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development
