Message-ID: <CALx6S37fpFimxrQ1hfvhDs7ukOvNKmm1jHYmx00APYfif3-OMQ@mail.gmail.com>
Date:   Wed, 30 Jan 2019 20:33:22 -0800
From:   Tom Herbert <tom@...bertland.com>
To:     maowenan <maowenan@...wei.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH net-next] net: udp Allow CHECKSUM_UNNECESSARY packets to
 do GRO.

On Wed, Jan 30, 2019 at 6:59 PM maowenan <maowenan@...wei.com> wrote:
>
>
>
> On 2019/1/31 10:43, Tom Herbert wrote:
> > On Wed, Jan 30, 2019 at 5:58 PM maowenan <maowenan@...wei.com> wrote:
> >>
> >>
> >>
> >> On 2019/1/30 4:24, Tom Herbert wrote:
> >>> On Tue, Jan 29, 2019 at 12:08 AM maowenan <maowenan@...wei.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 2019/1/29 14:24, Tom Herbert wrote:
> >>>>> On Mon, Jan 28, 2019 at 10:04 PM maowenan <maowenan@...wei.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 2019/1/29 12:01, Tom Herbert wrote:
> >>>>>>> On Mon, Jan 28, 2019 at 7:00 PM maowenan <maowenan@...wei.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>> Do you have any comments about this change?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 2019/1/23 11:33, Mao Wenan wrote:
> >>>>>>>>> When udp4_gro_receive() gets a packet with uh->check=0,
> >>>>>>>>> skb_gro_checksum_validate_zero_check() will set
> >>>>>>>>> skb->ip_summed = CHECKSUM_UNNECESSARY;
> >>>>>>>>> skb->csum_level = 0;
> >>>>>>>>> Then udp_gro_receive() will flush the packet, which is not CHECKSUM_PARTIAL.
> >>>>>>>>> This is not what we expect, because check=0 in the UDP header indicates that
> >>>>>>>>> no checksum needs to be calculated for this packet, so we should go on to do GRO.
> >>>>>>>>>
> >>>>>>>>> This patch changes the value of csum_cnt according to skb->csum_level.
> >>>>>>>>> ---
> >>>>>>>>>  include/linux/netdevice.h | 1 +
> >>>>>>>>>  1 file changed, 1 insertion(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> >>>>>>>>> index 1377d08..9c819f1 100644
> >>>>>>>>> --- a/include/linux/netdevice.h
> >>>>>>>>> +++ b/include/linux/netdevice.h
> >>>>>>>>> @@ -2764,6 +2764,7 @@ static inline void skb_gro_incr_csum_unnecessary(struct sk_buff *skb)
> >>>>>>>>>                * during GRO. This saves work if we fallback to normal path.
> >>>>>>>>>                */
> >>>>>>>>>               __skb_incr_checksum_unnecessary(skb);
> >>>>>>>>> +             NAPI_GRO_CB(skb)->csum_cnt = skb->csum_level + 1;
> >>>>>>>
> >>>>>>> That doesn't look right. This would be reinitializing the GRO
> >>>>>>> checksums from the beginning.
> >>>>>>>
> >>>>>>>>>       }
> >>>>>>>>>  }
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>> I assume the code is bailing on this conditional:
> >>>>>>>
> >>>>>>> if (NAPI_GRO_CB(skb)->encap_mark ||
> >>>>>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >>>>>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >>>>>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
> >>>>>>>             !udp_sk(sk)->gro_receive)
> >>>>>>>                 goto out_unlock;
> >>>>>>>
> >>>>>>> I am trying to remember why this needs to check csum_cnt. If there was
> >>>>>>> a csum_cnt for the UDP csum being zero from checksum-unnecessary, it
> >>>>>>> was consumed by skb_gro_checksum_validate_zero_check in UDP4 GRO
> >>>>>>> receive.
> >>>>>>
> >>>>>> We hit this case with VXLAN packets between two VMs on different hosts. udp4_gro_receive() receives
> >>>>>> a packet with ip_summed=CHECKSUM_NONE, csum_cnt=0, csum_valid=0, and udp->check=0. Then skb_gro_checksum_validate_zero_check()->
> >>>>>> skb_gro_incr_csum_unnecessary() validates it and sets ip_summed=CHECKSUM_UNNECESSARY, csum_level=0, but csum_cnt and csum_valid
> >>>>>> stay zero. The packet is then flushed in udp_gro_receive() by the code you showed.
> >>>>>>
> >>>>>> So I think it forgets to modify csum_cnt, since csum_level is changed in skb_gro_incr_csum_unnecessary()->__skb_incr_checksum_unnecessary().
> >>>>>>
> >>>>> Yes, but the csum_level is changing since we've gone beyond the
> >>>>> checksums initially reported in checksum-unnecessary. GRO csum_cnt is
> >>>>> initialized to skb->csum_level + 1 at the start of GRO processing.
> >>>>>
> >>>>> If I recall, the rule is that UDP GRO requires at least one non-zero
> >>>>> checksum to be verified. The idea is that if we end up computing
> >>>>> packet checksums on the host for inner checksums like TCP during GRO,
> >>>>> then that's negating the performance benefits of GRO. Had UDP check
> >>>>> not been zero then we would do checksum unnecessary conversion and so
> >>>>> csum_valid would be set for the remainder of GRO processing. The
> >>>>> existing code is following the rule I believe, so this may be working
> >>>>> as intended.
> >>>>
> >>>> Do you have any suggestions if I need to do GRO when udp->check is zero?
> >>>> My previous modification, which works fine, is below:
> >>>>         if (NAPI_GRO_CB(skb)->encap_mark ||
> >>>>             (skb->ip_summed != CHECKSUM_PARTIAL &&
> >>>> +            skb->ip_summed != CHECKSUM_UNNECESSARY &&
> >>>
> >>> That's effectively disabling the rule that we need a real checksum
> >>> calculation to proceed with GRO. Besides that, the device returning
> >>> one checksum-unnecessary level because UDP csum is zero is pretty
> >>> pointless; we can just as easily get to the same state just by
> >>> looking at the field with CHECKSUM_NONE. What we really want to see
> >>> for GRO is a real checksum computation being done on the packet.
> >>>
> >>> A few questions:
> >>>
> >>> What type of packets are being GROed? Are these TCP? What performance
> >>> difference do you see with our patch? Can you try enabling UDP
> >>> checksums, and even RCO with VXLAN? With UDP encapsulation we
> >>> generally see better performance with checksum enabled since UDP
> >>> checksum offload is ubiquitous and we can easily convert
> >>> checksum-unnecessary (with non-zero csum) to checksum-complete.
> >>
> >> We use the physical network card to calculate the checksum of the inner packet with checksum offload.
> >> The UDP checksum of the VXLAN header is set to 0.
> >>
> > I see. It sounds like the device is really verifying two checksums in
> > the packet, the outer UDP checksum (which is zero for UDP) and an
> > inner checksum, but only reporting one checksum was verified. The
> > driver needs to set csum_level to 1 in this case (meaning two
> > checksums have been verified for checksum-unnecessary). What NIC are
> > you using?
>
> Currently it is an 82599, whose driver can't recognize VXLAN packets.
> I guess many NICs can't do this check in their drivers, so I think this is a
> common case. Should we fix it in the stack?
>
Mao,

The problem isn't in the stack; it seems to be in the driver. If the
device reports a verified checksum for an encapsulated packet, then the
driver needs to set csum_level to 1. Otherwise, the stack can't just
assume that the inner checksum was verified. *How* a driver deduces
that the device is reporting on an encapsulated checksum is
specific to the device and its driver. I'm not sure which driver you're
running, but if you search the code there should be something like
"skb->csum_level = 1" that would be a clue about support. A good
example is ixgbe: if the device reports checksum verified and the
packet was VXLAN, it deduces that the inner checksum was verified, and
so the driver sets CHECKSUM_UNNECESSARY and skb->csum_level. Of course,
all this complexity goes away when devices just provide
checksum-complete.
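
To make the accounting concrete, here is a minimal userspace sketch of
it, not the kernel code itself. The struct and the model_* function
names are made up for illustration, the enum only mirrors the kernel's
ip_summed values, and the flush test leaves out the gro_receive socket
check. It assumes a VXLAN packet whose outer UDP checksum field is zero:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel's ip_summed values (illustrative only). */
enum ip_summed {
    CHECKSUM_NONE,
    CHECKSUM_UNNECESSARY,
    CHECKSUM_COMPLETE,
    CHECKSUM_PARTIAL
};

/* Hypothetical, stripped-down view of the skb and GRO fields discussed. */
struct model_skb {
    enum ip_summed ip_summed;
    uint8_t csum_level;  /* checksums the device verified, minus one */
    uint8_t csum_cnt;    /* GRO's budget of device-verified checksums */
    bool csum_valid;     /* GRO holds a full packet checksum */
    bool encap_mark;
};

/* Start of GRO: csum_cnt is derived from what the driver reported. */
static void model_gro_init(struct model_skb *skb)
{
    skb->csum_valid = (skb->ip_summed == CHECKSUM_COMPLETE);
    skb->csum_cnt = (skb->ip_summed == CHECKSUM_UNNECESSARY)
                        ? (uint8_t)(skb->csum_level + 1) : 0;
    skb->encap_mark = false;
}

/* Outer UDP step for a zero checksum field: zero is accepted as "valid". */
static void model_udp_zero_csum(struct model_skb *skb)
{
    if (skb->csum_cnt > 0) {
        skb->csum_cnt--;  /* consume one device-verified checksum */
    } else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
        if (skb->csum_level < 3)
            skb->csum_level++;
    } else if (skb->ip_summed == CHECKSUM_NONE) {
        /* Only the skb is updated here; csum_cnt stays at zero. */
        skb->ip_summed = CHECKSUM_UNNECESSARY;
        skb->csum_level = 0;
    }
}

/* The checksum part of the conditional that bails out of UDP GRO. */
static bool model_udp_gro_flushes(const struct model_skb *skb)
{
    return skb->encap_mark ||
           (skb->ip_summed != CHECKSUM_PARTIAL &&
            skb->csum_cnt == 0 &&
            !skb->csum_valid);
}

/* True if GRO would continue past the outer UDP header in this model. */
bool model_vxlan_gro_proceeds(enum ip_summed reported, uint8_t csum_level)
{
    struct model_skb skb = { .ip_summed = reported, .csum_level = csum_level };

    assert(csum_level <= 3);  /* csum_level is a small bounded counter */
    model_gro_init(&skb);
    model_udp_zero_csum(&skb);
    return !model_udp_gro_flushes(&skb);
}
```

In this model, a driver reporting CHECKSUM_NONE, or
CHECKSUM_UNNECESSARY with csum_level 0, gets the packet flushed, while
CHECKSUM_UNNECESSARY with csum_level 1 or CHECKSUM_COMPLETE lets GRO
continue.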

Tom

> >
> > Tom
> >
> >
> >> With this patch, the bandwidth of TCP between two VMs increases from 2Gbit/s to 6Gbit/s.
> >>
> >>>
> >>> Tom
> >>>
> >>>
> >>>
> >>>
> >>>>              NAPI_GRO_CB(skb)->csum_cnt == 0 &&
> >>>>              !NAPI_GRO_CB(skb)->csum_valid) ||
> >>>>             !udp_sk(sk)->gro_receive)
> >>>>                 goto out_unlock;
> >>>>
> >>>>
> >>>>>
> >>>>> Tom
> >>>>>
> >>>>>>>
> >>>>>>> .
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>> .
> >>>>>
> >>>>
> >>>
> >>> .
> >>>
> >>
> >
> > .
> >
>
