Date:	Thu, 21 Apr 2016 05:59:06 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Steffen Klassert <steffen.klassert@...unet.com>
Cc:	Sowmini Varadhan <sowmini.varadhan@...cle.com>,
	netdev@...r.kernel.org
Subject: Re: [RFC PATCH] gro: Partly revert "net: gro: allow to build full
 sized skb"

On Thu, 2016-04-21 at 09:40 +0200, Steffen Klassert wrote:
> This partly reverts the below mentioned patch because on
> forwarding, such skbs can't be offloaded to a NIC.
> 
> We need this to get IPsec GRO for forwarding to work properly,
> otherwise the GRO aggregated packets get segmented again by
> the GSO layer. Although discovered when implementing IPsec GRO,
> this is a general problem in the forwarding path.
> 
> -------------------------------------------------------------------------
> commit 8a29111c7ca68d928dfab58636f3f6acf0ac04f7
> Author: Eric Dumazet <edumazet@...gle.com>
> Date:   Tue Oct 8 09:02:23 2013 -0700
> 
>     net: gro: allow to build full sized skb
> 
>     skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb,
>     typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags.
> 
>     It's relatively easy to extend the skb using frag_list to allow
>     more frags to be appended into the last sk_buff.
> 
>     This still builds very efficient skbs, and allows reaching 45 MSS per
>     skb.
> 
>     (45 MSS GRO packet uses one skb plus a frag_list containing 2 additional
>     sk_buff)
> 
>     High speed TCP flows benefit from this extension by lowering TCP stack
>     cpu usage (less packets stored in receive queue, less ACK packets
>     processed)
> 
>     Forwarding setups could be hurt, as such skbs will need to be
>     linearized, although it's not a new problem, as GRO could already
>     provide skbs with a frag_list.
> 
>     We could make the 65536 bytes threshold a tunable to mitigate this.
> 
>     (First time we need to linearize skb in skb_needs_linearize(), we could
>     lower the tunable to ~16*1460 so that following skb_gro_receive() calls
>     build smaller skbs)
> 
>     Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>     Signed-off-by: David S. Miller <davem@...emloft.net>
> ---------------------------------------------------------------------------
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@...unet.com>
> ---
> 
> Hi Eric, this is a followup on our discussion at the netdev
> conference. Would you still be ok with this revert, or do
> you think there is a better solution in sight?

Note that some GRO enabled drivers would still generate frag_list.

(This happens if they are using skb with some TCP payload in skb->head
and skb->head was allocated with kmalloc())

We already have the sysctl_max_skb_frags sysctl; we might add another
sysctl enabling/disabling GRO from building any frag_list.
(Or simply reuse an existing one, like /proc/sys/net/ipv4/ip_forward ?)
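
A minimal user-space sketch of the second idea, for illustration only:
it reads the existing ip_forward sysctl and derives a yes/no answer for
whether GRO should be allowed to chain skbs on frag_list. The helper
names read_sysctl_int() and gro_may_use_frag_list() are hypothetical,
and no such check exists in the kernel in this form; a real version
would sit in the GRO receive path rather than parse /proc.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Read an integer sysctl exposed under /proc/sys.
 * Returns 'fallback' when the file is missing or cannot be parsed.
 */
static int read_sysctl_int(const char *path, int fallback)
{
    FILE *f = fopen(path, "r");
    int value = fallback;

    if (!f)
        return fallback;
    if (fscanf(f, "%d", &value) != 1)
        value = fallback;
    fclose(f);
    return value;
}

/*
 * Hypothetical policy: when the host forwards packets, GRO should stop
 * at the frags[] limit and never build a frag_list, so the aggregated
 * skb does not need to be linearized or resegmented on transmit.
 */
static bool gro_may_use_frag_list(int ip_forward)
{
    return ip_forward == 0;
}
```

A caller could combine the two as
gro_may_use_frag_list(read_sysctl_int("/proc/sys/net/ipv4/ip_forward", 0)).
Keying on ip_forward avoids a new knob, at the cost of also shrinking
GRO packets for locally terminated flows on a forwarding host.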

Here at Google, we increased MAX_SKB_FRAGS, but this is a rather
intrusive change to be upstreamed :(
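
For reference, the figures in the quoted commit message (24616 bytes,
45 MSS, 2 additional sk_buff) can be reproduced with a little
arithmetic. This is only a sketch: the MSS of 1448 bytes (1500 MTU
minus IPv4, TCP and timestamp headers), the value 17 for MAX_SKB_FRAGS
(4K pages) and the one-segment-per-frag layout are assumptions, not
values stated in this thread, and the function names are made up for
the example.

```c
#include <stdio.h>

#define ASSUMED_MSS        1448u   /* assumption: 1500 - 20 - 20 - 12 */
#define ASSUMED_MAX_FRAGS  17u     /* assumption: 65536 / 4096 + 1 */
#define GRO_SIZE_LIMIT     65536u  /* threshold named in the commit message */

/* Payload bytes that fit when every frag slot holds one MSS-sized segment. */
static unsigned int bytes_with_frags_only(unsigned int mss, unsigned int max_frags)
{
    return mss * max_frags;
}

/* Largest number of whole segments whose payload stays below the limit. */
static unsigned int max_segments_under_limit(unsigned int mss, unsigned int limit)
{
    return (limit - 1) / mss;
}

/* Number of sk_buffs needed when each one carries at most max_frags segments. */
static unsigned int skbs_needed(unsigned int segments, unsigned int max_frags)
{
    return (segments + max_frags - 1) / max_frags;
}
```

Under these assumptions, bytes_with_frags_only() gives 17 * 1448 =
24616, max_segments_under_limit() gives 45, and skbs_needed(45, 17)
gives 3, i.e. one head skb plus 2 more on frag_list.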


