[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALx6S35waJuT4WrVnCVh03E8XfPQh64jXWTwWa2Ezd-wNwsRfA@mail.gmail.com>
Date: Tue, 22 Mar 2016 15:09:47 -0700
From: Tom Herbert <tom@...bertland.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>,
Kernel Team <kernel-team@...com>,
Alex Duyck <aduyck@...antis.com>
Subject: Re: [PATCH net-next] net: Fix remote checksum offload with GUE
On Tue, Mar 22, 2016 at 2:45 PM, Alexander Duyck
<alexander.duyck@...il.com> wrote:
> On Tue, Mar 22, 2016 at 2:10 PM, Tom Herbert <tom@...bertland.com> wrote:
>> On Tue, Mar 22, 2016 at 1:20 PM, Alexander Duyck
>> <alexander.duyck@...il.com> wrote:
>>> On Tue, Mar 22, 2016 at 12:19 PM, Tom Herbert <tom@...bertland.com> wrote:
>>>> In skb_segment the check of whether or not to perform the checksum on
>>>> host was changed to not consider rather remote checksum offload is
>>>> in use. In the case that can_checksum_protocol fails the checksum
>>>> is computed regardless. __skb_udp_tunnel_segment was modified in a
>>>> related patch to add NETIF_F_HW_CSUM into features when grabbing
>>>> the enc_features and remote checksum offload is being done. The
>>>> problem is that this bit can be cleared in lower GSO layers that
>>>> are also doing tunneling (e.g. ipip, GRE when used with GUE),
>>>> so when we get to skb_segment that intent has been lost and
>>>> can_checksum_protocol fails.
>>>
>>> So what you are describing sounds like a tunnel in tunnel scenario.
>>> It might work better to just skip masking the features if
>>> skb->remcsum_offload is set rather than trying to change how we
>>> perform the offload.
>>>
>> To be clear, my patch is restoring the old behavior not implementing a new one.
>
> Yes, but the old behavior could lead to kernel panics under certain
> circumstances.
>
>>> I'm pretty sure this will cause data corruption and maybe a kernel
>>> panic if Tx checksum offload is disabled.
>>>
>> Nope, working fine for me.
>
> What are the options you used to create the tunnel? Did you enable
> both remcsum and udpcsum?
>
modprobe fou
./ip fou add port 6080 gue
./ip link add name tun1 type ipip remote 10.1.1.2 local 10.1.1.2 ttl
225 encap gue encap-sport auto encap-dport 6080 encap-csum
encap-remcsum
ifconfig tun1 192.168.1.1
ip route add 192.168.1.0/24 dev tun1
Thanks,
Tom
>> The outer checksum would be computed in
>>>> This patch:
>>>>
>>>> - Restores the check in skb_segment to look at remote checksum offload.
>>>> - Removes the code in __skb_udp_tunnel_segment to force the
>>>> NETIF_F_HW_CSUM feature since this is no longer useful with above
>>>> change.
>>>> - Removes check for remote checksum offload in gso_reset_checksum.
>>>> A special case should not be needed here.
>>>>
>>>> Tested: Single netperf STREAM over GUE-ipip
>>>>
>>>> Before fix:
>>>> 5625 Mbps
>>>> After fix:
>>>> 6410 Mbps
>>>>
>>>> Fixes: 76443456227097179c1482 ("net: Move GSO csum into SKB_GSO_CB")
>>>> Signed-off-by: Tom Herbert <tom@...bertland.com>
>>>> ---
>>>> include/linux/skbuff.h | 4 ----
>>>> net/core/skbuff.c | 5 ++---
>>>> net/ipv4/udp_offload.c | 10 ----------
>>>> 3 files changed, 2 insertions(+), 17 deletions(-)
>>>>
>>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>>> index 15d0df9..f6fe8a5 100644
>>>> --- a/include/linux/skbuff.h
>>>> +++ b/include/linux/skbuff.h
>>>> @@ -3615,10 +3615,6 @@ static inline int gso_pskb_expand_head(struct sk_buff *skb, int extra)
>>>>
>>>> static inline void gso_reset_checksum(struct sk_buff *skb, __wsum res)
>>>> {
>>>> - /* Do not update partial checksums if remote checksum is enabled. */
>>>> - if (skb->remcsum_offload)
>>>> - return;
>>>> -
>>>> SKB_GSO_CB(skb)->csum = res;
>>>> SKB_GSO_CB(skb)->csum_start = skb_checksum_start(skb) - skb->head;
>>>> }
>>>
>>> I'm pretty sure this part here will break things when you don't have
>>> an outer offload enabled. What NIC did you test this on? Did you try
>>> disabling the Tx checksum support in the hardware to see what would
>>> happen?
>>>
>> Mellanox.
>>
>> When TX checksum is disabled the outer checksum is computed in
>> __skb_udp_tunnel_segment.
>
> Yes, but that uses gso_make_checksum which requires the csum and
> csum_start fields be initialized in the SKB_GSO_CB().
>
>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>>> index f044f97..e4eb78d 100644
>>>> --- a/net/core/skbuff.c
>>>> +++ b/net/core/skbuff.c
>>>> @@ -3259,14 +3259,13 @@ skip_fraglist:
>>>> nskb->truesize += nskb->data_len;
>>>>
>>>> perform_csum_check:
>>>> - if (!csum) {
>>>> + if (!csum && !nskb->remcsum_offload) {
>>>> if (skb_has_shared_frag(nskb)) {
>>>> err = __skb_linearize(nskb);
>>>> if (err)
>>>> goto err;
>>>> }
>>>> - if (!nskb->remcsum_offload)
>>>> - nskb->ip_summed = CHECKSUM_NONE;
>>>> + nskb->ip_summed = CHECKSUM_NONE;
>>>> SKB_GSO_CB(nskb)->csum =
>>>> skb_checksum(nskb, doffset,
>>>> nskb->len - doffset, 0);
>>>
>>> I'm pretty sure this is going to cause a huge mess if you are
>>> requesting remote checksum but cannot perform an outer checksum. One
>>> of the reasons I merged these features together the way I did was
>>> because we needed to perform the initial checksum to avoid causing a
>>> kernel panic later on. Otherwise we don't have the SKB_GSO_CB()->csum
>>> and SKB_GSO_CB()->csum_start fields populated.
>>>
>> Nope, if we can't perform outer checksum it is done in the host. I'm
>> not seeing any problem.
>
> Are you running remcsum with udpcsum set or is udpcsum not set? If
> you don't set udpcsum it doesn't actually compute an outer checksum
> value.
>
>>>> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
>>>> index 8a3405a..f86f1e1 100644
>>>> --- a/net/ipv4/udp_offload.c
>>>> +++ b/net/ipv4/udp_offload.c
>>>> @@ -78,16 +78,6 @@ static struct sk_buff *__skb_udp_tunnel_segment(struct sk_buff *skb,
>>>>
>>>> features &= skb->dev->hw_enc_features;
>>>>
>>>> - /* The only checksum offload we care about from here on out is the
>>>> - * outer one so strip the existing checksum feature flags and
>>>> - * instead set the flag based on our outer checksum offload value.
>>>> - */
>>>> - if (remcsum || ufo) {
>>>> - features &= ~NETIF_F_CSUM_MASK;
>>>> - if (!need_csum || offload_csum)
>>>> - features |= NETIF_F_HW_CSUM;
>>>> - }
>>>> -
>>>> /* segment inner packet. */
>>>> segs = gso_inner_segment(skb, features);
>>>> if (IS_ERR_OR_NULL(segs)) {
>>>
>>> This part breaks UDP fragmentation I am pretty sure. We need this bit
>>> for UFO to avoid having to perform a checksum over the data twice if
>>> we are offloading the outer UDP checksum.
>>
>> The maybe leave this for ufo if it is a problem (please test your
>> suppositions!), but this is not good for remcsum offload.
>
> I suspect we are supposed to be setting encap_level to 0 when we set
> skb->encapsulation to 0 before processing inner headers in the case of
> UDP and GRE tunnel offloads. I'll try to test and submit something in
> the next hour or two.
>
> - Alex
Powered by blists - more mailing lists