[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1370030944.2703.17.camel@bwh-desktop.uk.level5networks.com>
Date: Fri, 31 May 2013 21:09:04 +0100
From: Ben Hutchings <bhutchings@...arflare.com>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>, Jerry Chu <hkchu@...gle.com>,
Herbert Xu <herbert@...dor.apana.org.au>
Subject: Re: [PATCH net-next] gro: should aggregate frames without DF
On Fri, 2013-05-31 at 10:02 -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@...gle.com>
>
> GRO on IPv4 doesn't aggregate frames if they don't have DF bit set.
>
> Some servers use IP_MTU_DISCOVER/IP_PMTUDISC_PROBE, so linux receivers
> are unable to aggregate this kind of traffic.
>
> The right thing to do is to allow aggregation as long as the DF bit has
> same value on all segments.
>
> bnx2x LRO does this correctly.
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Jerry Chu <hkchu@...gle.com>
> Cc: Herbert Xu <herbert@...dor.apana.org.au>
> ---
> net/ipv4/af_inet.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index b05ae96..328cc62 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -1384,9 +1384,9 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
> if (unlikely(ip_fast_csum((u8 *)iph, 5)))
> goto out_unlock;
>
> - id = ntohl(*(__be32 *)&iph->id);
> - flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (id ^ IP_DF));
> - id >>= 16;
> + flush = ntohs(iph->tot_len) ^ skb_gro_len(skb);
> +
> + id = ntohs(iph->id);
>
> for (p = *head; p; p = p->next) {
> struct iphdr *iph2;
> @@ -1407,6 +1407,7 @@ static struct sk_buff **inet_gro_receive(struct sk_buff **head,
> NAPI_GRO_CB(p)->flush |=
> (iph->ttl ^ iph2->ttl) |
> (iph->tos ^ iph2->tos) |
> + ((iph->frag_off ^ iph2->frag_off) & htons(IP_DF)) |
But this results in ignoring the actual offset bits of frag_off!
We should allow merging only if all packets have frag_off == IP_DF or
all have frag_off == 0. The first assignment of flush therefore still
needs to check the combined id/frag_off word, but using (id & ~IP_DF)
instead of (id ^ IP_DF).
Ben.
> ((u16)(ntohs(iph2->id) + NAPI_GRO_CB(p)->count) ^ id);
>
> NAPI_GRO_CB(p)->flush |= flush;
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists