Message-ID: <57903a5f-5033-4da9-8807-0b6f894827a9@huawei.com>
Date: Thu, 31 Jan 2019 10:58:56 +0800
From: maowenan <maowenan@...wei.com>
To: Tom Herbert <tom@...bertland.com>
CC: Linux Kernel Network Developers <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH net-next] net: udp Allow CHECKSUM_UNNECESSARY packets to
do GRO.
On 2019/1/31 10:43, Tom Herbert wrote:
> On Wed, Jan 30, 2019 at 5:58 PM maowenan <maowenan@...wei.com> wrote:
>>
>>
>>
>> On 2019/1/30 4:24, Tom Herbert wrote:
>>> On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@...wei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2019/1/29 14:24, Tom Herbert wrote:
>>>>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@...wei.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2019/1/29 12:01, Tom Herbert wrote:
>>>>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@...wei.com> wrote:
>>>>>>>>
>>>>>>>> Hi all,
>>>>>>>> Do you have any comments about this change?
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
>>>>>>>>> When udp4_gro_receive() gets a packet with uh->check == 0,
>>>>>>>>> skb_gro_checksum_validate_zero_check() will set
>>>>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>>>>>>> skb->csum_level = 0;
>>>>>>>>> Then udp_gro_receive() will flush the packet because it is not CHECKSUM_PARTIAL.
>>>>>>>>> This is not what we expect: check == 0 in the UDP header indicates that
>>>>>>>>> no checksum needs to be calculated for this packet, so we should go on to do GRO.
>>>>>>>>>
>>>>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
>>>>>>>>> ---
>>>>>>>>> include/linux/netdevice.h | 1 +
>>>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>>>
>>>>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>>>>>> index 1377d08..9c819f1 100644
>>>>>>>>> --- a/include/linux/netdevice.h
>>>>>>>>> +++ b/include/linux/netdevice.h
>>>>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>>>>>>>>> * during GRO. This saves work if we fallback to normal path.
>>>>>>>>> */
>>>>>>>>> __skb_incr_checksum_unnecessary(skb);
>>>>>>>>> + NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
>>>>>>>
>>>>>>> That doesn't look right. This would be reinitializing the GRO
>>>>>>> checksums from the beginning.
>>>>>>>
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>> I assume the code is bailing on this conditional:
>>>>>>>
>>>>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>>>> (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>>> NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>>>> !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>>>> !udp_sk(sk)->gro_receive)
>>>>>>> goto out_unlock;
>>>>>>>
>>>>>>> I am trying to remember why this needs to check csum_cnt. If there was
>>>>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
>>>>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
>>>>>>> receive.
>>>>>>
>>>>>> We have hit this case with VXLAN packets between two VMs on different hosts. When udp4_gro_receive() receives
>>>>>> a packet with ip_summed=CHECKSUM_NONE, csum_cnt=0, csum_valid=0 and udp->check=0, skb_gro_checksum_validate_zero_check()->
>>>>>> skb_gro_incr_csum_unnecessary() validates it and sets ip_summed=CHECKSUM_UNNECESSARY, csum_level=0, but csum_cnt and csum_valid
>>>>>> stay zero. The packet is then flushed in udp_gro_receive() by the code you showed.
>>>>>>
>>>>>> So I think it forgets to update csum_cnt when csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
>>>>>>
>>>>> Yes, but the csum_level is changing since we've gone beyond the
>>>>> checksums initially reported in checksum-unnecessary. GRO csum_cnt is
>>>>> initialized to skb->csum_level + 1 at the start of GRO processing.
>>>>>
>>>>> If I recall, the rule is that UDP GRO requires at least one non-zero
>>>>> checksum to be verified. The idea is that if we end up computing
>>>>> packet checksums on the host for inner checksums like TCP during GRO,
>>>>> then that's negating the performance benefits of GRO. Had UDP check
>>>>> not been zero then we would do checksum unnecessary conversion and so
>>>>> csum_valid would be set for the remainder of GRO processing. The
>>>>> existing code is following the rule I believe, so this may be working
>>>>> as intended.
>>>>
>>>> Do you have any suggestion for how to do GRO when udp->check is zero?
>>>> My previous modification, which works fine, is below:
>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
>>>> (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>> + skb->ip_summed != CHECKSUM_UNNECESSARY &&
>>>
>>> That's effectively disabling the rule that we need a real checksum
>>> calculation to proceed with GRO. Besides that, the device returning
>>> one checksum-unnecessary level because UDP csum is zero is pretty
>>> pointless; we can just as easily get to the same state just by
>>> looking at the field with CHECKSUM_NONE. What we really want to see
>>> for GRO is a real checksum computation being done on the packet.
>>>
>>> A few questions:
>>>
>>> What type of packets are being GROed? Are these TCP? What performance
>>> difference do you see with your patch? Can you try enabling UDP
>>> checksums, and even RCO with VXLAN? With UDP encapsulation we
>>> generally see better performance with checksum enabled since UDP
>>> checksum offload is ubiquitous and we can easily convert
>>> checksum-unnecessary (with non-zero csum) to checksum-complete.
>>
>> We use the physical network card to calculate the checksum of the inner packet with checksum offload.
>> The UDP checksum of the VXLAN header is set to 0.
>>
> I see. It sounds like the device is really verifying two checksums in
> the packet, the outer UDP checksum (which is zero for UDP) and an
> inner checksum, but only reporting one checksum was verified. The
> driver needs to set csum_level to 1 in this case (meaning two
> checksums have been verified for checksum-unnecessary). What NIC are
> you using?
Currently it is an 82599, whose driver can't recognize VXLAN packets.
I guess many NICs can't do this check in their drivers, so I think this is a
common case. Should we fix it in the stack?
>
> Tom
>
>
>> With this patch, the TCP bandwidth between two VMs increases from 2Gbit/s to 6Gbit/s.
>>
>>>
>>> Tom
>>>
>>>
>>>
>>>
>>>> NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>> !NAPI_GRO_CB(skb)->csum_valid) ||
>>>> !udp_sk(sk)->gro_receive)
>>>> goto out_unlock;
>>>>
>>>>
>>>>>
>>>>> Tom
>>>>>
>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>
>>> .
>>>
>>
>
> .
>
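
For readers following the thread, the checksum accounting discussed above can
be sketched as a small user-space model. This is not kernel code: the struct,
the model_* function names and MODEL_MAX_CSUM_LEVEL are hypothetical, and the
flush test keeps only the checksum terms of the conditional quoted by Tom
(the encap_mark and gro_receive terms are left out). It assumes, as stated in
the thread, that csum_cnt starts at csum_level + 1 for CHECKSUM_UNNECESSARY
and that a zero outer UDP checksum either consumes one reported checksum or
marks the packet CHECKSUM_UNNECESSARY with csum_level 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model; names mirror the fields quoted in the thread only. */
enum ip_summed {
	CHECKSUM_NONE,
	CHECKSUM_UNNECESSARY,
	CHECKSUM_COMPLETE,
	CHECKSUM_PARTIAL
};

#define MODEL_MAX_CSUM_LEVEL 3	/* assumed cap, for illustration only */

struct model_skb {
	enum ip_summed ip_summed;
	int csum_level;		/* stands in for skb->csum_level */
	int csum_cnt;		/* stands in for NAPI_GRO_CB(skb)->csum_cnt */
	bool csum_valid;	/* stands in for NAPI_GRO_CB(skb)->csum_valid */
};

/* Start of GRO: seed the per-packet counters from what the driver reported. */
static void model_gro_init(struct model_skb *skb)
{
	skb->csum_valid = (skb->ip_summed == CHECKSUM_COMPLETE);
	skb->csum_cnt = (skb->ip_summed == CHECKSUM_UNNECESSARY)
			? skb->csum_level + 1 : 0;
}

/* Outer UDP header has check == 0: consume one reported checksum if any
 * is left, otherwise record a newly "verified" level on the skb. */
static void model_udp_zero_check(struct model_skb *skb)
{
	if (skb->csum_cnt > 0) {
		skb->csum_cnt--;
	} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
		if (skb->csum_level < MODEL_MAX_CSUM_LEVEL)
			skb->csum_level++;
	} else if (skb->ip_summed == CHECKSUM_NONE) {
		skb->ip_summed = CHECKSUM_UNNECESSARY;
		skb->csum_level = 0;
	}
}

/* Checksum part of the bail-out conditional quoted in the thread. */
static bool model_udp_gro_flushes(const struct model_skb *skb)
{
	return skb->ip_summed != CHECKSUM_PARTIAL &&
	       skb->csum_cnt == 0 &&
	       !skb->csum_valid;
}

/* Run one packet through the model; returns true if GRO would bail out. */
bool model_flushed(enum ip_summed reported, int csum_level)
{
	struct model_skb skb = { .ip_summed = reported, .csum_level = csum_level };

	assert(csum_level >= 0);
	model_gro_init(&skb);
	model_udp_zero_check(&skb);
	return model_udp_gro_flushes(&skb);
}
```

Under this model, model_flushed(CHECKSUM_NONE, 0) and
model_flushed(CHECKSUM_UNNECESSARY, 0) are both true, which is the flush
reported above. model_flushed(CHECKSUM_UNNECESSARY, 1) is false, which
corresponds to Tom's suggestion that the driver report csum_level = 1 when
the device has also verified the inner checksum.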