Message-ID: <20181008112753.GN3823@gauss3.secunet.de>
Date: Mon, 8 Oct 2018 13:27:53 +0200
From: Steffen Klassert <steffen.klassert@...unet.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
CC: Paolo Abeni <pabeni@...hat.com>,
Network Development <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
"Willem de Bruijn" <willemb@...gle.com>
Subject: Re: [PATCH net-next RFC 0/8] udp and configurable gro
On Fri, Oct 05, 2018 at 10:41:47AM -0400, Willem de Bruijn wrote:
> On Fri, Oct 5, 2018 at 9:53 AM Paolo Abeni <pabeni@...hat.com> wrote:
> >
> > Hi all,
> >
> > On Fri, 2018-09-14 at 13:59 -0400, Willem de Bruijn wrote:
> > > This is a *very rough* draft. Mainly for discussion while we also
> > > look at another partially overlapping approach [1].
> >
> > I'm wondering how we go on from this ? I'm fine with either approaches.
>
> Let me send the udp gro static_key patch. Then we don't need
> the enable udp on demand logic (patch 2/4).
>
> Your implementation of GRO is more fleshed out (patch 3/4) than
> my quick hack. My only request would be to use a separate
> UDP_GRO socket option instead of adding this to the existing
> UDP_SEGMENT.
>
> Sounds good?
>
> > Also, I'm interested in [try to] enable GRO/GSO batching in the
> > forwarding path, as you outlined initially in the GSO series
> > submission. That should cover Steffen use-case, too, right?
>
> Great. Indeed. Though there is some unresolved discussion on
> one large gso skb vs frag list. There has been various concerns
> around the use of frag lists for GSO in the past, and it does not
> match h/w offload. So I think the answer would be the first unless
> the second proves considerably faster (in which case it could also
> be added later as optimization).

I think whether the first or the second approach is faster
depends a bit on the use case, the hardware, etc. So it would
be good if we could choose which one to use depending on that.

For local socket receive, building big GSO packets
is likely faster than the chaining method.

But on forwarding, the chaining method might be faster
because we don't have the overhead of creating GSO
packets and of segmenting them back to their native
form (at least as long as we don't have NICs that
support hardware UDP GSO). The same applies to packets
that undergo IPsec transformation.
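
To make the trade-off concrete, here is a rough user-space sketch of
the two approaches. It is a toy model only, not kernel code: struct pkt,
merge_big(), segment_big() and chain() are hypothetical names made up
for illustration, and the memcpy() calls merely stand in for the
per-packet work of building and re-segmenting a GSO packet (the real
stack does not necessarily copy payload when merging).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SEGS 8	/* toy limit on datagrams per batch */
#define SEG_LEN  4	/* toy payload size of one datagram */

/* Hypothetical stand-in for an skb: payload plus a chaining pointer. */
struct pkt {
	unsigned char data[MAX_SEGS * SEG_LEN];
	size_t len;
	struct pkt *next;
};

/*
 * Approach 1: coalesce n datagrams into one big packet.
 * Caller must ensure n <= MAX_SEGS and each len <= SEG_LEN.
 * Returns the number of payload bytes that had to be touched.
 */
size_t merge_big(struct pkt *big, const struct pkt *segs, size_t n)
{
	size_t touched = 0;

	big->len = 0;
	big->next = NULL;
	for (size_t i = 0; i < n; i++) {
		memcpy(big->data + big->len, segs[i].data, segs[i].len);
		big->len += segs[i].len;
		touched += segs[i].len;
	}
	return touched;
}

/*
 * Forwarding the big packet without hardware offload: split it back
 * into datagrams of at most seg_len bytes. The out array must have
 * room for every resulting segment. Returns the segment count.
 */
size_t segment_big(const struct pkt *big, struct pkt *out, size_t seg_len)
{
	size_t n = 0;

	for (size_t off = 0; off < big->len; off += seg_len) {
		size_t l = big->len - off < seg_len ? big->len - off : seg_len;

		memcpy(out[n].data, big->data + off, l);
		out[n].len = l;
		out[n].next = NULL;
		n++;
	}
	return n;
}

/*
 * Approach 2: leave the datagrams in their native form and only link
 * them, so they travel through the stack together. Returns the head.
 */
struct pkt *chain(struct pkt *segs, size_t n)
{
	if (n == 0)
		return NULL;
	for (size_t i = 0; i + 1 < n; i++)
		segs[i].next = &segs[i + 1];
	segs[n - 1].next = NULL;
	return segs;
}
```

In this model the forwarding path pays for merge_big() plus
segment_big() with the first approach, but only for pointer updates
with chain(). A local receiver, on the other hand, gets a single
contiguous packet from merge_big() and would have to walk the list
in the chained case.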

Another case where the chaining method could be interesting
is when we receive already big LRO or HW GRO packets from the
NIC. Packets of the same flow could still travel together
through the stack with the chaining method. I've never tried
this, though. For now it is just an idea.

I have the code for the chaining method here, I'd just need
some way to hook it in. Maybe it could be done with
some sort of inet_update_offload() as Paolo already
proposed in his patchset, or we could make it configurable
per device...