Message-ID: <1461243546.7627.15.camel@edumazet-glaptop3.roam.corp.google.com>
Date: Thu, 21 Apr 2016 05:59:06 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Steffen Klassert <steffen.klassert@...unet.com>
Cc: Sowmini Varadhan <sowmini.varadhan@...cle.com>,
netdev@...r.kernel.org
Subject: Re: [RFC PATCH] gro: Partly revert "net: gro: allow to build full
sized skb"
On Thu, 2016-04-21 at 09:40 +0200, Steffen Klassert wrote:
> This partly reverts the below mentioned patch because on
> forwarding, such skbs can't be offloaded to a NIC.
>
> We need this to get IPsec GRO for forwarding to work properly,
> otherwise the GRO aggregated packets get segmented again by
> the GSO layer. Although discovered when implementing IPsec GRO,
> this is a general problem in the forwarding path.
>
> -------------------------------------------------------------------------
> commit 8a29111c7ca68d928dfab58636f3f6acf0ac04f7
> Author: Eric Dumazet <edumazet@...gle.com>
> Date: Tue Oct 8 09:02:23 2013 -0700
>
> net: gro: allow to build full sized skb
>
> skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb,
> typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags.
>
> It's relatively easy to extend the skb using frag_list to allow
> more frags to be appended into the last sk_buff.
>
> This still builds very efficient skbs, and allows reaching 45 MSS per
> skb.
>
> (45 MSS GRO packet uses one skb plus a frag_list containing 2 additional
> sk_buff)
>
> High speed TCP flows benefit from this extension by lowering TCP stack
> cpu usage (less packets stored in receive queue, less ACK packets
> processed)
>
> Forwarding setups could be hurt, as such skbs will need to be
> linearized, although its not a new problem, as GRO could already
> provide skbs with a frag_list.
>
> We could make the 65536 bytes threshold a tunable to mitigate this.
>
> (First time we need to linearize skb in skb_needs_linearize(), we could
> lower the tunable to ~16*1460 so that following skb_gro_receive() calls
> build smaller skbs)
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Signed-off-by: David S. Miller <davem@...emloft.net>
> ---------------------------------------------------------------------------
>
> Signed-off-by: Steffen Klassert <steffen.klassert@...unet.com>
> ---
>
> Hi Eric, this is a followup on our discussion at the netdev
> conference. Would you still be ok with this revert, or do
> you think there is a better solution in sight?
Note that some GRO-enabled drivers would still generate a frag_list.
(This happens if they use skbs with some TCP payload in skb->head,
and skb->head was allocated with kmalloc().)

We have the sysctl_max_skb_frags sysctl; we might add a sysctl
enabling/disabling GRO from building any frag_list.
(Or simply reuse an existing one, like /proc/sys/net/ipv4/ip_forward?)

Here at Google, we increased MAX_SKB_FRAGS, but this is a rather
intrusive change to be upstreamed :(