Date:   Wed, 20 Feb 2019 16:32:54 +0800
From:   maowenan <maowenan@...wei.com>
To:     Tom Herbert <tom@...bertland.com>
CC:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH net-next] net: udp Allow CHECKSUM_UNNECESSARY packets to
 do GRO.



On 2019/1/31 12:33, Tom Herbert wrote:
> On Wed, Jan 30, 2019 at 6:59 PM maowenan <maowenan@...wei.com> wrote:
>>
>>
>>
>> On 2019/1/31 10:43, Tom Herbert wrote:
>>> On Wed, Jan 30, 2019 at 5:58 PM maowenan <maowenan@...wei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2019/1/30 4:24, Tom Herbert wrote:
>>>>> On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@...wei.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2019/1/29 14:24, Tom Herbert wrote:
>>>>>>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@...wei.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2019/1/29 12:01, Tom Herbert wrote:
>>>>>>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@...wei.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>> Do you have any comments about this change?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
>>>>>>>>>>> When udp4_gro_receive() gets a packet with uh->check == 0,
>>>>>>>>>>> skb_gro_checksum_validate_zero_check() will set
>>>>>>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
>>>>>>>>>>> skb->csum_level = 0;
>>>>>>>>>>> Then udp_gro_receive() will flush the packet because it is not CHECKSUM_PARTIAL.
>>>>>>>>>>> This is not what we expect: check == 0 in the UDP header indicates that
>>>>>>>>>>> no checksum needs to be calculated for this packet, so we should go on to do GRO.
>>>>>>>>>>>
>>>>>>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
>>>>>>>>>>> ---
>>>>>>>>>>>  include/linux/netdevice.h | 1 +
>>>>>>>>>>>  1 file changed, 1 insertion(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>>>>>>>>>> index 1377d08..9c819f1 100644
>>>>>>>>>>> --- a/include/linux/netdevice.h
>>>>>>>>>>> +++ b/include/linux/netdevice.h
>>>>>>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
>>>>>>>>>>>                * during GRO. This saves work if we fallback to normal path.
>>>>>>>>>>>                */
>>>>>>>>>>>               __skb_incr_checksum_unnecessary(skb);
>>>>>>>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
>>>>>>>>>
>>>>>>>>> That doesn't look right. This would be reinitializing the GRO
>>>>>>>>> checksums from the beginning.
>>>>>>>>>
>>>>>>>>>>>       }
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I assume the code is bailing on this conditional:
>>>>>>>>>
>>>>>>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>>>>>>             !udp_sk(sk)->gro_receive)
>>>>>>>>>                 goto out_unlock;
>>>>>>>>>
>>>>>>>>> I am trying to remember why this needs to check csum_cnt. If there was
>>>>>>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
>>>>>>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
>>>>>>>>> received.
>>>>>>>>
>>>>>>>> We hit this case with VXLAN traffic between two VMs on different hosts. udp4_gro_receive() receives
>>>>>>>> a packet with ip_summed=CHECKSUM_NONE, csum_cnt=0, csum_valid=0 and udp->check=0; skb_gro_checksum_validate_zero_check()->
>>>>>>>> skb_gro_incr_csum_unnecessary() then validates it and sets ip_summed=CHECKSUM_UNNECESSARY, csum_level=0, but csum_cnt and csum_valid
>>>>>>>> stay zero. The packet is then flushed in udp_gro_receive() by the code you showed.
>>>>>>>>
>>>>>>>> So I think csum_cnt should also be updated, since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
>>>>>>>>
>>>>>>> Yes, but the csum_level is changing since we've gone beyond the
>>>>>>> checksums initially reported in checksum-unnecessary. GRO csum_cnt is
>>>>>>> initialized to skb->csum_level + 1 at the start of GRO processing.
>>>>>>>
>>>>>>> If I recall, the rule is that UDP GRO requires at least one non-zero
>>>>>>> checksum to be verified. The idea is that if we end up computing
>>>>>>> packet checksums on the host for inner checksums like TCP during GRO,
>>>>>>> then that's negating the performance benefits of GRO. Had UDP check
>>>>>>> not been zero then we would do checksum unnecessary conversion and so
>>>>>>> csum_valid would be set for the remainder of GRO processing. The
>>>>>>> existing code is following the rule I believe, so this may be working
>>>>>>> as intended.
>>>>>>
>>>>>> Do you have any suggestion for doing GRO when udp->check is zero?
>>>>>> My previous modification, which works fine, is below:
>>>>>>         if (NAPI_GRO_CB(skb)->encap_mark ||
>>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
>>>>>> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&
>>>>>
>>>>> That's effectively disabling the rule that we need a real checksum
>>>>> calculation to proceed with GRO. Besides that, the device returning
>>>>> one checksum-unnecessary level because UDP csum is zero is pretty
>>>>> pointless; we can just as easily get to the same state just by
>>>>> looking at the field with CHECKSUM_NONE. What we really want to see
>>>>> for GRO is a real checksum computation being done on the packet.
>>>>>
>>>>> A few questions:
>>>>>
>>>>> What type of packets are being GROed? Are these TCP? What performance
>>>>> difference do you see with your patch? Can you try enabling UDP
>>>>> checksums, and even RCO with VXLAN? With UDP encapsulation we
>>>>> generally see better performance with checksum enabled since UDP
>>>>> checksum offload is ubiquitous and we can easily convert
>>>>> checksum-unnecessary (with non-zero csum) to checksum-complete.
>>>>
>>>> We use the physical NIC's checksum offload to calculate the checksum of the inner packet,
>>>> and set the UDP checksum of the VXLAN header to 0.
>>>>
>>> I see. It sounds like the device is really verifying two checksums in
>>> the packet, the outer UDP checksum (which is zero for UDP) and an
>>> inner checksum, but only reporting one checksum was verified. The
>>> driver needs to set csum_level to 1 in this case (meaning two
>>> checksums have been verified for checksum-unnecessary). What NIC are
>>> you using?
>>
>> Currently it is an 82599, whose driver can't recognize the VXLAN packet.
>> I guess many NICs can't do this check in their drivers, so I think this is a
>> common case. Should we fix it in the stack?
>>
> Mao,
> 
> The problem isn't in the stack, it seems to be in the driver. If the
> device reports a verified checksum for an encapsulated packet then the
> driver needs to set csum_level to 1. Otherwise, the stack can't just
> assume that the inner checksum was verified. *How* a driver deduces
> that the device is reporting about an encapsulated checksum is
> specific to the device and its driver. I'm not sure which driver you're
> running, but if you search the code there should be something like
> "skb->csum_level = 1" that would be a clue about support. A good
> example is ixgbe, if the device reports checksum verified and that
> packet was VXLAN, it deduces that the inner checksum was verified and
> so the driver sets CHECKSUM_UNNECESSARY and skb->csum_level. Of course
> all this complexity goes away when devices just provide
> checksum-complete.
> 
> Tom

Hi Tom,

I have checked ixgbe_rx_checksum() in ixgbe_main.c. Because of an 82599 erratum, UDP frames with a
zero checksum can be marked as checksum errors. When the function returns on that path,
skb->ip_summed is left as CHECKSUM_NONE, so the packet is then flushed to the stack in udp_gro_receive().
Do you think this is the driver's fault or not?

ixgbe_rx_checksum():
	if (ixgbe_test_staterr(rx_desc, IXGBE_RXDADV_ERR_TCPE)) {
		/*
		 * 82599 errata, UDP frames with a 0 checksum can be marked as
		 * checksum errors.
		 */
		if ((pkt_info & cpu_to_le16(IXGBE_RXDADV_PKTTYPE_UDP)) &&
		    test_bit(__IXGBE_RX_CSUM_UDP_ZERO_ERR, &ring->state))
			return;

		ring->rx_stats.csum_err++;
		return;
	}
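
As a rough illustration of the driver-side rule discussed in this thread (report
csum_level = 1 only when the device verified the inner checksum of a packet the
driver recognised as encapsulated), here is a small userspace sketch. It is not
the ixgbe code: the types and names (drv_rx_desc, drv_rx_result,
drv_rx_checksum) are invented for the model, and the descriptor status bits are
reduced to booleans.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's ip_summed values (model only). */
enum drv_ip_summed { DRV_CHECKSUM_NONE, DRV_CHECKSUM_UNNECESSARY };

/* Hypothetical RX descriptor, reduced to the bits that matter here. */
struct drv_rx_desc {
	bool l4_checked;	/* device attempted L4 checksum validation */
	bool l4_error;		/* device flagged an L4 checksum error */
	bool is_udp;		/* packet type reported as UDP */
	bool is_encap;		/* driver recognised the packet as VXLAN */
};

struct drv_rx_result {
	enum drv_ip_summed ip_summed;
	unsigned int csum_level;
	bool counted_as_error;	/* would bump a csum_err style counter */
};

static struct drv_rx_result drv_rx_checksum(const struct drv_rx_desc *d,
					    bool udp_zero_errata)
{
	struct drv_rx_result r = { DRV_CHECKSUM_NONE, 0, false };

	if (!d->l4_checked)
		return r;

	if (d->l4_error) {
		/*
		 * Errata case: a zero-checksum UDP frame flagged as an error
		 * is not counted, but the stack still sees CHECKSUM_NONE.
		 */
		r.counted_as_error = !(d->is_udp && udp_zero_errata);
		return r;
	}

	r.ip_summed = DRV_CHECKSUM_UNNECESSARY;
	if (d->is_encap)
		r.csum_level = 1;	/* two checksums vouched for */
	return r;
}
```

In this model the errata path and an unrecognised VXLAN packet both leave the
stack with at most one reported checksum, which is the situation described above.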

udp_gro_receive():
	if (NAPI_GRO_CB(skb)->encap_mark ||
	    (skb->ip_summed != CHECKSUM_PARTIAL &&
	     NAPI_GRO_CB(skb)->csum_cnt == 0 &&
	     !NAPI_GRO_CB(skb)->csum_valid) ||
	    !udp_sk(sk)->gro_receive)
		goto out_unlock;
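
To make the bookkeeping behind this conditional easier to follow, here is a
simplified userspace model of how I read it: csum_cnt starts at
csum_level + 1 for CHECKSUM_UNNECESSARY, a zero outer UDP checksum either
consumes one count or bumps the skb's checksum-unnecessary state, and
udp_gro_receive() then applies the test above. This is a sketch, not kernel
code; every model_* name is hypothetical, and the socket lookup is reduced to a
boolean.

```c
#include <stdbool.h>

#define MODEL_MAX_CSUM_LEVEL 3	/* mirrors SKB_MAX_CSUM_LEVEL */

enum model_ip_summed {
	MODEL_CHECKSUM_NONE,
	MODEL_CHECKSUM_UNNECESSARY,
	MODEL_CHECKSUM_COMPLETE,
	MODEL_CHECKSUM_PARTIAL,
};

/* Flattened view of the skb and NAPI_GRO_CB(skb) fields involved. */
struct model_skb {
	enum model_ip_summed ip_summed;
	unsigned int csum_level;
	unsigned int csum_cnt;
	bool csum_valid;
	bool encap_mark;
};

/* Start of GRO processing: derive the GRO state from what the driver set. */
static void model_gro_init(struct model_skb *skb)
{
	skb->csum_valid = (skb->ip_summed == MODEL_CHECKSUM_COMPLETE);
	skb->csum_cnt = (skb->ip_summed == MODEL_CHECKSUM_UNNECESSARY)
				? skb->csum_level + 1 : 0;
	skb->encap_mark = false;
}

/* Outer UDP header with uh->check == 0: nothing is computed. */
static void model_udp_zero_check(struct model_skb *skb)
{
	if (skb->csum_cnt > 0) {
		skb->csum_cnt--;	/* consume one reported checksum */
	} else if (skb->ip_summed == MODEL_CHECKSUM_UNNECESSARY) {
		if (skb->csum_level < MODEL_MAX_CSUM_LEVEL)
			skb->csum_level++;
	} else if (skb->ip_summed == MODEL_CHECKSUM_NONE) {
		skb->ip_summed = MODEL_CHECKSUM_UNNECESSARY;
		skb->csum_level = 0;
	}
}

/* The conditional from udp_gro_receive(): true means "goto out_unlock". */
static bool model_udp_gro_flushes(const struct model_skb *skb,
				  bool have_gro_receive)
{
	return skb->encap_mark ||
	       (skb->ip_summed != MODEL_CHECKSUM_PARTIAL &&
		skb->csum_cnt == 0 &&
		!skb->csum_valid) ||
	       !have_gro_receive;
}

/* Whole path for a VXLAN packet whose outer UDP checksum is zero. */
static bool model_zero_csum_flushed(enum model_ip_summed ip_summed,
				    unsigned int csum_level)
{
	struct model_skb skb = { .ip_summed = ip_summed,
				 .csum_level = csum_level };

	model_gro_init(&skb);
	model_udp_zero_check(&skb);
	return model_udp_gro_flushes(&skb, true);
}
```

With this model, a packet that arrives as CHECKSUM_NONE, or as
CHECKSUM_UNNECESSARY with csum_level 0, is flushed, while
CHECKSUM_UNNECESSARY with csum_level 1 still has one count left after the outer
UDP header and continues into GRO.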

> 
>>>
>>> Tom
>>>
>>>
>>>> With this patch, TCP bandwidth between the two VMs increases from 2 Gbit/s to 6 Gbit/s.
>>>>
>>>>>
>>>>> Tom
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
>>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
>>>>>>             !udp_sk(sk)->gro_receive)
>>>>>>                 goto out_unlock;
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Tom
>>>>>>>
