Message-ID: <1383091610.1534.29.camel@bwh-desktop.uk.level5networks.com>
Date: Wed, 30 Oct 2013 00:06:50 +0000
From: Ben Hutchings <bhutchings@...arflare.com>
To: David Miller <davem@...emloft.net>
CC: <eric.dumazet@...il.com>, <christoph.paasch@...ouvain.be>,
<herbert@...dor.apana.org.au>, <netdev@...r.kernel.org>,
<hkchu@...gle.com>, <mwdalton@...gle.com>
Subject: Re: [PATCH v2 net-next] net: introduce gro_frag_list_enable sysctl
On Tue, 2013-10-29 at 19:44 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Tue, 29 Oct 2013 08:12:35 -0700
>
> > From: Eric Dumazet <edumazet@...gle.com>
> >
> > Christoph Paasch and Jerry Chu reported crashes in skb_segment() caused
> > by commit 8a29111c7ca6 ("net: gro: allow to build full sized skb")
> >
> > (Jerry is working on adding native GRO support for tunnels)
> >
> > skb_segment() only deals with a frag_list chain containing MSS sized
> > fragments.
> >
> > This patch adds support for any kind of frag, and adds a new sysctl,
> > as clearly the GRO layer should avoid building frag_list skbs
> > on a router, as the segmentation is adding cpu overhead.
> >
> > Note that we could try to reuse page fragments instead of doing
> > copy to linear skbs, but this requires a fair amount of work,
> > and possible truesize nightmares, as we do not track individual
> > (per page fragment) truesizes.
> >
> > /proc/sys/net/core/gro_frag_list_enable possible values are :
> >
> > 0 : GRO layer is not allowed to use frag_list to extend skb capacity
> > 1 : GRO layer is allowed to use frag_list, but skb_segment()
> > automatically sets the sysctl to 0.
> > 2 : GRO is allowed to use frag_list, and skb_segment() won't
> > clear the sysctl.
> >
> > Default value is 1 : automatic discovery
> >
> > Reported-by: Christoph Paasch <christoph.paasch@...ouvain.be>
> > Reported-by: Jerry Chu <hkchu@...gle.com>
> > Cc: Michael Dalton <mwdalton@...gle.com>
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > ---
> > v2: added missing sysctl definition in skbuff.c
>
> I do not like the idea of packet actions indirectly changing sysctl
> values, even if you document it sufficiently as you have here.
>
> Plus this puts the sysctl change logic in a fast path.
>
> I would suggest instead making it change in response to changes to
> ip_forward, as we do with per-device LRO settings. This means that,
> like ip_forward, you should also make this sysctl a global + devinet
> per-device sysctl.
>
> You might even emit a pr_info() when this logic triggers, and if you
> are ambitious enough keep track of the previous GRO sysctl state so
> you can restore it if ip_forward is set back to zero.
Speaking of which: instead of disabling LRO once, we really ought to
keep count of the forwarders (IPv4 routing, IPv6 routing, bridge) and
use that to mask out LRO in netdev_fix_features().
I think that the forwarder count is also needed for this.
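
To make the forwarder-count idea concrete, here is a rough userspace sketch. It is not kernel code: the struct, the flag bits and every sketch_* function are hypothetical names invented for illustration, and only the general shape (a per-device count consulted from a fix-features hook, as netdev_fix_features() would be) comes from the suggestion above.

```c
#include <assert.h>

/* Hypothetical feature bits; stand-ins for flags such as NETIF_F_LRO. */
#define SKETCH_F_LRO (1u << 0)
#define SKETCH_F_GRO (1u << 1)

/* Hypothetical, heavily reduced model of a network device. */
struct sketch_netdev {
	unsigned int wanted_features;  /* what the administrator asked for */
	unsigned int features;         /* what is currently in effect */
	unsigned int forwarder_count;  /* IPv4 routing, IPv6 routing, bridge, ... */
};

/* Mask out LRO for as long as at least one forwarder uses the device. */
unsigned int sketch_fix_features(const struct sketch_netdev *dev,
				 unsigned int features)
{
	if (dev->forwarder_count > 0)
		features &= ~SKETCH_F_LRO;
	return features;
}

/* Recompute the effective features from the wanted set. */
void sketch_update_features(struct sketch_netdev *dev)
{
	dev->features = sketch_fix_features(dev, dev->wanted_features);
}

/* A forwarder starts using the device. */
void sketch_forwarder_get(struct sketch_netdev *dev)
{
	dev->forwarder_count++;
	sketch_update_features(dev);
}

/* A forwarder stops using the device; an unbalanced put is a caller bug. */
void sketch_forwarder_put(struct sketch_netdev *dev)
{
	assert(dev->forwarder_count > 0);
	dev->forwarder_count--;
	sketch_update_features(dev);
}
```

Because the wanted features are never modified, LRO comes back by itself when the last forwarder goes away, so no separate "previous state" needs to be remembered. The same count could presumably gate frag_list use in GRO.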
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.