Message-ID: <1477582016.7065.212.camel@edumazet-glaptop3.roam.corp.google.com>
Date: Thu, 27 Oct 2016 08:26:56 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Jon Maxwell <jmaxwell37@...il.com>
Cc: tlfalcon@...ux.vnet.ibm.com, benh@...nel.crashing.org,
paulus@...ba.org, mpe@...erman.id.au, davem@...emloft.net,
tom@...bertland.com, jarod@...hat.com, hofrat@...dl.org,
netdev@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-kernel@...r.kernel.org, mleitner@...hat.com,
jmaxwell@...hat.com
Subject: Re: [PATCH net-next] ibmveth: v1 calculate correct gso_size and set
gso_type
On Wed, 2016-10-26 at 11:09 +1100, Jon Maxwell wrote:
> We recently encountered a bug where a few customers using ibmveth on the
> same LPAR hit an issue where a TCP session hung when large receive was
> enabled. Closer analysis revealed that the session was stuck because
> one side was repeatedly advertising a zero window.
>
> We narrowed this down to the fact that the ibmveth driver did not set
> gso_size, which TCP translates into the MSS later up the stack. The MSS is
> used to calculate the TCP window size, and as it was abnormally large,
> TCP was calculating a zero window even though the socket's receive buffer
> was completely empty.
>
> We were able to reproduce this and worked with IBM to fix this. Thanks Tom
> and Marcelo for all your help and review on this.
>
> The patch fixes both our internal reproduction tests and our customers' tests.
>
> Signed-off-by: Jon Maxwell <jmaxwell37@...il.com>
> ---
> drivers/net/ethernet/ibm/ibmveth.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
> index 29c05d0..c51717e 100644
> --- a/drivers/net/ethernet/ibm/ibmveth.c
> +++ b/drivers/net/ethernet/ibm/ibmveth.c
> @@ -1182,6 +1182,8 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
> int frames_processed = 0;
> unsigned long lpar_rc;
> struct iphdr *iph;
> + bool large_packet = 0;
> + u16 hdr_len = ETH_HLEN + sizeof(struct tcphdr);
>
> restart_poll:
> while (frames_processed < budget) {
> @@ -1236,10 +1238,28 @@ static int ibmveth_poll(struct napi_struct *napi, int budget)
> iph->check = 0;
> iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
> adapter->rx_large_packets++;
> + large_packet = 1;
> }
> }
> }
>
> + if (skb->len > netdev->mtu) {
> + iph = (struct iphdr *)skb->data;
> + if (be16_to_cpu(skb->protocol) == ETH_P_IP &&
> + iph->protocol == IPPROTO_TCP) {
> + hdr_len += sizeof(struct iphdr);
> + skb_shinfo(skb)->gso_type = SKB_GSO_TCPV4;
> + skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
> + } else if (be16_to_cpu(skb->protocol) == ETH_P_IPV6 &&
> + iph->protocol == IPPROTO_TCP) {
> + hdr_len += sizeof(struct ipv6hdr);
> + skb_shinfo(skb)->gso_type = SKB_GSO_TCPV6;
> + skb_shinfo(skb)->gso_size = netdev->mtu - hdr_len;
> + }
> + if (!large_packet)
> + adapter->rx_large_packets++;
> + }
> +
>
This might break forwarding and PMTU discovery.
You force gso_size to a value derived from the device MTU, regardless of
the real MSS used by the TCP sender.
Isn't the MSS provided in the RX descriptor, so that you don't have to
guess the value?
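
To illustrate the concern, here is a minimal standalone sketch (user-space C,
not kernel code). It assumes IPv4 and TCP headers without options, and the
helper names and the 1400-byte path MTU are hypothetical, chosen only for the
example. It compares the gso_size the patch would derive from the device MTU
with the MSS a sender might really be using on a path with a smaller MTU.

```c
#include <assert.h>

/* Header sizes without options; assumed for this illustration only. */
enum { ETH_HDR = 14, IPV4_HDR = 20, TCP_HDR = 20 };

/* Mirrors the patch's IPv4 arithmetic:
 * gso_size = mtu - (ETH_HLEN + sizeof(iphdr) + sizeof(tcphdr)). */
static unsigned int guessed_gso_size(unsigned int mtu)
{
	return mtu - (ETH_HDR + IPV4_HDR + TCP_HDR);
}

/* IP length of a full-sized packet produced when an aggregated skb is
 * split again into segments of gso_size payload bytes (e.g. when it is
 * forwarded). */
static unsigned int resegmented_ip_len(unsigned int gso_size)
{
	return gso_size + IPV4_HDR + TCP_HDR;
}

/* Hypothetical scenario: device MTU 1500, but the sender negotiated its
 * MSS for a 1400-byte path MTU. */
static void gso_guess_example(void)
{
	unsigned int dev_mtu = 1500;
	unsigned int path_mtu = 1400;                        /* assumed */
	unsigned int sender_mss = path_mtu - IPV4_HDR - TCP_HDR;
	unsigned int guess = guessed_gso_size(dev_mtu);

	/* The guess is larger than the MSS the sender really used... */
	assert(guess > sender_mss);
	/* ...so packets rebuilt from it would not fit the path MTU, while
	 * packets rebuilt from the real MSS would. */
	assert(resegmented_ip_len(guess) > path_mtu);
	assert(resegmented_ip_len(sender_mss) <= path_mtu);
}
```

With these assumed numbers the guess is 1446 bytes against a real MSS of 1360,
so segments rebuilt from the guess would be 1486-byte IP packets on a path
that only carries 1400. Taking the MSS from the RX descriptor, if the
hardware reports it, would avoid the mismatch.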