Message-ID: <568FD689.3070300@solarflare.com>
Date: Fri, 8 Jan 2016 15:32:25 +0000
From: Edward Cree <ecree@...arflare.com>
To: Alexander Duyck <alexander.duyck@...il.com>
CC: David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>,
<linux-net-drivers@...arflare.com>,
Tom Herbert <tom@...bertland.com>
Subject: Re: [PATCH v2 net-next 1/5] net: local checksum offload for
encapsulation
On 07/01/16 22:53, Alexander Duyck wrote:
> On Thu, Jan 7, 2016 at 9:12 AM, Edward Cree <ecree@...arflare.com> wrote:
>> The arithmetic properties of the ones-complement checksum mean that a
>> correctly checksummed inner packet, including its checksum, has a ones
>> complement sum depending only on whatever value was used to initialise
>> the checksum field before checksumming (in the case of TCP and UDP,
>> this is the ones complement sum of the pseudo header, complemented).
>> Consequently, if we are going to offload the inner checksum with
>> CHECKSUM_PARTIAL, we can compute the outer checksum based only on the
>> packet data not covered by the inner checksum, and the initial value of
>> the inner checksum field.
>>
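To illustrate that property for anyone following along, here's a quick
userspace demo (made-up payload words; 'init' stands in for the folded
pseudo-header sum that gets put in the checksum field before offload):

        #include <stdint.h>
        #include <stdio.h>

        /* 16-bit ones-complement sum of n words, added onto sum */
        static uint16_t csum16(const uint16_t *p, int n, uint32_t sum)
        {
                while (n--)
                        sum += *p++;
                while (sum >> 16)       /* fold carries back in */
                        sum = (sum & 0xffff) + (sum >> 16);
                return sum;
        }

        int main(void)
        {
                uint16_t seg[4] = { 0x1234, 0xabcd, 0 /* check */, 0x0f0f };
                uint16_t init = 0xbeef; /* stand-in for folded pseudo-header sum */

                seg[2] = init;
                seg[2] = ~csum16(seg, 4, 0);    /* what the device writes */
                /* The finished segment, including its checksum field, sums
                 * to ~init, whatever the payload words were:
                 */
                printf("%04x == %04x\n", csum16(seg, 4, 0), (uint16_t)~init);
                return 0;
        }

which prints the same value twice (4110, with these words).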
>> Signed-off-by: Edward Cree <ecree@...arflare.com>
>> ---
>>  include/linux/skbuff.h    | 26 ++++++++++++++++++++++++++
>>  net/ipv4/ip_tunnel_core.c |  4 ++++
>>  net/ipv4/udp.c            | 29 ++++++++++-------------------
>>  net/ipv6/ip6_checksum.c   | 24 ++++++++----------------
>>  4 files changed, 48 insertions(+), 35 deletions(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 6b6bd42..6590d08 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -3665,5 +3665,31 @@ static inline unsigned int skb_gso_network_seglen(const struct sk_buff *skb)
>>  	return hdr_len + skb_gso_transport_seglen(skb);
>>  }
>>
>> +/* Local Checksum Offload.
>> + * Compute outer checksum based on the assumption that the
>> + * inner checksum will be offloaded later.
>> + * See Documentation/networking/tx-offloads.txt for
>> + * explanation of how this works.
>> + * Fill in outer checksum adjustment (e.g. with sum of outer
>> + * pseudo-header) before calling.
>> + * Also ensure that inner checksum is in linear data area.
>> + */
>> +static inline __wsum lco_csum(struct sk_buff *skb)
>> +{
>> +	char *inner_csum_field;
>> +	__wsum csum;
>> +
>> +	/* Start with complement of inner checksum adjustment */
>> +	inner_csum_field = skb->data + skb_checksum_start_offset(skb) +
>> +			   skb->csum_offset;
> You would probably benefit from caching off the result of
> skb_checksum_start_offset into a local variable so the compiler
> doesn't go through and recompute it when you call it again below.
It's a nearly-trivial inline function; won't the compiler be smart enough to
cache that result itself?
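For reference, the function in question is just this (include/linux/skbuff.h):

        static inline int skb_checksum_start_offset(const struct sk_buff *skb)
        {
                return skb->csum_start - skb_headroom(skb);
        }

I'd expect that to collapse into a common subexpression, but I'm happy to
use a local if you'd prefer.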
>> +	csum = ~csum_unfold(*(__force __sum16 *)inner_csum_field);
> This seems like a lot of work; couldn't you get away with just
> bit-flipping this and moving it into uh->check on the outer header?
It's not a lot of work: all this does is zero-extend to 32 bits and flip.
It looks like more, but most of it is just a cast; it's written in this way
to pacify sparse while using as little __force as possible.
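For reference, csum_unfold() is itself just the cast (include/net/checksum.h):

        static inline __wsum csum_unfold(__sum16 n)
        {
                return (__force __wsum)n;
        }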
lco_csum() can't move it into uh->check, because it doesn't have uh. In
fact, the skb might not even be UDP - this function is intended to be used
for GRE as well, which has an IP-style checksum, but in a different place.
(The next version of the patch series will implement that, btw.)
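Roughly like this - an untested sketch, names invented, assuming skb->data
points at the just-pushed GRE header when it's called. GRE's checksum field
is the first option word after the base header, and there's no pseudo-header,
so the outer adjustment is simply zero:

        static void gre_lco_csum(struct sk_buff *skb, struct gre_base_hdr *greh)
        {
                __sum16 *check = (__sum16 *)(greh + 1);

                /* Zero the outer adjustment (no pseudo-header for GRE),
                 * then fold the LCO sum into the checksum field.
                 */
                *check = 0;
                *check = csum_fold(lco_csum(skb));
        }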
>> +	/* Add in checksum of our headers (incl. outer checksum
>> +	 * adjustment filled in by caller)
>> +	 */
>> +	csum = skb_checksum(skb, 0, skb_checksum_start_offset(skb), csum);
>> +	/* The result is the outer checksum */
>> +	return csum;
>> +}
>> +
> The more I think about it, the less sure I am that there's much to be
> gained by having this as a separate function anyway, since I think you
> might be able to exploit things better with a few changes to the
> ordering of operations. See my notes below in the IPv4 section.
>
>> #endif /* __KERNEL__ */
>> #endif /* _LINUX_SKBUFF_H */
>> diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
>> index 1db8418..f39b064 100644
>> --- a/net/ipv4/ip_tunnel_core.c
>> +++ b/net/ipv4/ip_tunnel_core.c
>> @@ -146,6 +146,10 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
>> }
>> EXPORT_SYMBOL_GPL(iptunnel_metadata_reply);
>>
>> +/* csum_help should only ever be true if the protocol doesn't support LCO.
>> + * If the tunnel uses udp_tunnel_xmit_skb(), then it gets LCO for free, and
>> + * should always set csum_help to false.
>> + */
>> struct sk_buff *iptunnel_handle_offloads(struct sk_buff *skb,
>> 					 bool csum_help,
>> 					 int gso_type_mask)
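i.e. a UDP-based tunnel driver would do something like this (illustrative
only):

        /* UDP tunnels get LCO via udp_tunnel_xmit_skb(), so never ask
         * iptunnel_handle_offloads() for csum_help.
         */
        skb = iptunnel_handle_offloads(skb, false, SKB_GSO_UDP_TUNNEL);
        if (IS_ERR(skb))
                return PTR_ERR(skb);

leaving the inner checksum offloaded as CHECKSUM_PARTIAL.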
>> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
>> index 8841e98..c1c73be 100644
>> --- a/net/ipv4/udp.c
>> +++ b/net/ipv4/udp.c
>> @@ -767,32 +767,23 @@ void udp_set_csum(bool nocheck, struct sk_buff *skb,
>>  {
>>  	struct udphdr *uh = udp_hdr(skb);
>>
>> -	if (nocheck)
>> +	if (nocheck) {
>>  		uh->check = 0;
>> -	else if (skb_is_gso(skb))
>> +	} else if (skb_is_gso(skb)) {
>>  		uh->check = ~udp_v4_check(len, saddr, daddr, 0);
>> -	else if (skb_dst(skb) && skb_dst(skb)->dev &&
>> -		 (skb_dst(skb)->dev->features &
>> -		  (NETIF_F_IP_CSUM | NETIF_F_HW_CSUM))) {
>> -
>> -		BUG_ON(skb->ip_summed == CHECKSUM_PARTIAL);
>> +	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
>> +		__wsum csum;
>>
> I wonder if this shouldn't be made a check that is in addition to the
> two options below, instead of completely replacing them. The question
> I would have is whether there are any cases where we need to follow
> the path that results in CHECKSUM_UNNECESSARY being set.
I don't think there can be such a case.
Either: we've already set up PARTIAL for an inner header, so we can
definitely do LCO.
Or: we haven't set up PARTIAL yet, so we can use that now. If the device
doesn't support it, it'll get fixed up later when we validate_xmit_skb().
So there's no way (AFAICT) that we'd ever not be able to use PARTIAL.
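The fixup I mean is this bit of validate_xmit_skb() in net/core/dev.c
(quoting roughly from memory):

        if (skb->ip_summed == CHECKSUM_PARTIAL) {
                if (skb->encapsulation)
                        skb_set_inner_transport_header(skb,
                                                       skb_checksum_start_offset(skb));
                else
                        skb_set_transport_header(skb,
                                                 skb_checksum_start_offset(skb));
                /* Device can't checksum this packet; do it in software */
                if (!(features & NETIF_F_CSUM_MASK) &&
                    skb_checksum_help(skb))
                        goto out_kfree_skb;
        }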
Unless - hmmm - what happens if we've set up PARTIAL for a CRC (e.g. SCTP)
rather than an IP checksum? But it looks to me as if the old code would
already have screwed up in that case: iptunnel_handle_offloads() would do
the inner csum in skb_checksum_help(), which computes an IP checksum. So
I'm guessing this can't happen - or it's already broken, and my patch won't
make it any worse ;)

However, the next version of the patch series will split this change out
from the rest of the patch, as Tom Herbert suggested.
>
>> +		uh->check = ~udp_v4_check(len, saddr, daddr, 0);
>> +		csum = lco_csum(skb);
>> +		uh->check = csum_fold(csum);
>> +		if (uh->check == 0)
>> +			uh->check = CSUM_MANGLED_0;
>
> You would probably benefit from reordering this to something like what
> we have in the last block below this one. The idea is then you only
> have to fold things once and can avoid some unnecessary duplication.
>
> So you could code it up with something like:
> __wsum csum;
> int start_offset;
>
> start_offset = skb_checksum_start_offset(skb);
> uh->check = ~(*(__sum16 *)(skb->data + start_offset + skb->csum_offset));
> csum = skb_checksum(skb, 0, start_offset, 0);
> uh->check = udp_v4_check(len, saddr, daddr, csum);
> if (uh->check == 0)
> uh->check = CSUM_MANGLED_0;
>
> Forgive the formatting, my email client mangles tabs badly. By using
> the pseudo header checksum from the inner header for the starting
> value and then computing the udp_v4_check for the outer header last
> you save yourself a few cycles since you only have to fold the
> checksum once instead of once for the pseudo-header and again for the
> final result.
Hmm, I think we can do this without losing the helper function (which will
be shared not just by UDPv4 and UDPv6 but also GRE).
Something like this:
        uh->check = 0;
        uh->check = udp_v4_check(len, saddr, daddr, lco_csum(skb));
        if (uh->check == 0)
                uh->check = CSUM_MANGLED_0;
Now the only fold is the one udp_v4_check() does.
Would that shave off enough cycles to satisfy?