[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+FuTSc3sOuPsQ3sJSCudCwZky4FcGF5CopejURmGZUSjXEn3Q@mail.gmail.com>
Date: Mon, 13 Jan 2020 11:21:07 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Steffen Klassert <steffen.klassert@...unet.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
David Miller <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>,
Subash Abhinov Kasiviswanathan <subashab@...eaurora.org>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Network Development <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 3/4] net: Support GRO/GSO fraglist chaining.
On Mon, Jan 13, 2020 at 3:51 AM Steffen Klassert
<steffen.klassert@...unet.com> wrote:
>
> On Thu, Dec 19, 2019 at 11:28:34AM -0500, Willem de Bruijn wrote:
> > On Thu, Dec 19, 2019 at 3:22 AM Steffen Klassert
> > <steffen.klassert@...unet.com> wrote:
> > >
> > > I tried to find the subset of __copy_skb_header() that is needed to copy.
> > > Some fields of nskb are still valid, and some (csum related) fields
> > > should not be copied from skb to nskb.
> >
> > Duplicating that code is kind of fragile wrt new fields being added to
> > skbs later (such as the recent skb_ext example).
> >
> > Perhaps we can split __copy_skb_header further and call the
> > inner part directly.
>
> I thought already about that, but __copy_skb_header does a
> memcpy over all the header fields. If we split this, we
> would need change the memcpy to direct assignments.
Okay, if any of those fields should not be overwritten in this case,
then that's not an option. That memcpy is probably a lot more
efficient than all the direct assignments.
> Maybe we can be conservative here and do a full
> __copy_skb_header for now. The initial version
> does not necessarily need to be the most performant
> version. We could try to identify the correct subset
> of header fields later then.
We should probably aim for the right set from the start. If you think
this set is it, let's keep it.
Could you add an explicit comment that this is a subset of
__copy_skb_header. That might help remind us of this partial duplicate
whenever updating that function.
> > > I had to set ip_summed to CHECKSUM_UNNECESSARY on GRO to
> > > make sure the noone touches the checksum of the head
> > > skb. Otherise netfilter etc. tries to touch the csum.
> > >
> > > Before chaining I make sure that ip_summed and csum_level is
> > > the same for all chained skbs and here I restore the original
> > > value from nskb.
> >
> > This is safe because the skb_gro_checksum_validate will have validated
> > already on CHECKSUM_PARTIAL? What happens if there is decap or encap
> > in the path? We cannot revert to CHECKSUM_PARTIAL after that, I
> > imagine.
>
> Yes, the checksum is validated with skb_gro_checksum_validate. If the
> packets are UDP encapsulated, they are segmented before decapsulation.
> Original values are already restored. If an additional encapsulation
> happens, the encap checksum will be calculated after segmentation.
> Original values are restored before that.
I was wondering more about additional other encapsulation protocols.
>From a quick read, it seems like csum_level is associated only with
CHECKSUM_UNNECESSARY.
What if a device returns CHECKSUM_COMPLETE for packets with a tunnel
that is decapsulated before forwarding. Say, just VLAN. That gets
untagged in __netif_receive_skb_core with skb_vlan_untag calling
skb_pull_rcsum. After segmentation the ip_summed is restored, with
skb->csum still containing the unmodified csum that includes the VLAN
tag?
>
> >
> > Either way, would you mind briefly documenting the checksum behavior
> > in the commit message? It's not trivial and I doubt I'll recall the
> > details in six months.
>
> Yes, can do this.
>
> > Really about patch 4: that squashed in a lot of non-trivial scaffolding
> > from previous patch 'UDP: enable GRO by default'. Does it make sense
> > to keep that in a separate patch? That should be a noop, which we can
> > verify. And it makes patch 4 easier to reason about on its own, too.
>
> Patch 4 is not that big, so not sure it that makes really sense.
> But I can split out a preparation patch if that is preferred.
Thanks. If it's not too complex. I do think that it will help make the
non-obvious functional change stand out from the larger noop code
refactor. But perhaps it's indeed nitpicking. Leave as is if you
prefer, for sure.
> Anyway, I likely do another RFC version because we are already
> late in the development cycle.
The feature is now disabled by default, so it could definitely go in
late in the cycle and allow us to find and fix bugs during
stabilization. Up to you, obviously!
Powered by blists - more mailing lists