[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHXqBFJ0OfX8TN9ZbyUAM5QbNB=CKFHCJXSWfhens2tMyMTxhw@mail.gmail.com>
Date: Mon, 30 Jan 2012 09:25:11 +0100
From: Michał Mirosław <mirqus@...il.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Or Gerlitz <ogerlitz@...lanox.com>,
Roland Dreier <roland@...nel.org>,
Herbert Xu <herbert@...dor.apana.org.au>, davem@...emloft.net,
linux-rdma <linux-rdma@...r.kernel.org>,
Shlomo Pongratz <shlomop@...lanox.com>, netdev@...r.kernel.org
Subject: Re: [PATCH 2/2] IB/ipoib: fix GRO merge failure for IPoIB originated
TCP streams
2012/1/30 Eric Dumazet <eric.dumazet@...il.com>:
> Le lundi 30 janvier 2012 à 09:44 +0200, Or Gerlitz a écrit :
>> On 1/30/2012 6:36 AM, Roland Dreier wrote:
>> > On Thu, Jan 26, 2012 at 6:43 AM, Or Gerlitz<ogerlitz@...lanox.com> wrote:
>> >> The GRO flow makes a check in every layer to ensure the packets
>> >> are actually merged only if they match at all layers.
>> >>
>> >> The first GRO check, at L2 always fails for IPoIB, since it assumes
>> >> that all packets have 14 bytes of Ethernet link layer header. Using the
>> >> IPoIB header will not help here either, since its only four bytes. To
>> >> overcome this, the skb mac header pointer is set to an area within the
>> >> packet IB GRH headroom, such that later, the L2 check done by GRO
>> >> succeeds and it can move to checks at the network and transport layers.
>> >
>> >> --- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>> >> +++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
>> >> @@ -286,10 +287,20 @@ static void ipoib_ib_handle_rx_wc(struct net_device *dev, struct ib_wc *wc)
>> >> else
>> >> skb->pkt_type = PACKET_MULTICAST;
>> >>
>> >> - skb_pull(skb, IB_GRH_BYTES);
>> >> + /*
>> >> + * GRO first does L2 compares (14 bytes). We must not let it start from
>> >> + * the IPoIB header as ten octets of the IP header, containing fields
>> >> + * which vary from packet to packet will cause non-merging of packets.
>> >> + * from the same TCP stream.
>> >> + */
>> >> + psgid = skb_pull(skb, offsetof(struct ib_grh, sgid));
>> >> + /* if there's no GRH, that area could contain random data */
>> >> + if (!(wc->wc_flags& IB_WC_GRH))
>> >> + memset(psgid, 0, 16);
>> >> + skb_reset_mac_header(skb);
>> >> + skb_pull(skb, IB_GRH_BYTES - offsetof(struct ib_grh, sgid));
>> >>
>> >> skb->protocol = ((struct ipoib_header *) skb->data)->proto;
>> >> - skb_reset_mac_header(skb);
>> >
>> > This seems like a really weird place to fix this. Wouldn't it
>> > make more sense to fix the GRO check to handle non-ethernet L2 headers?
>>
>> Yes, we can do that as well. Herbert, Dave, would it be enough here, to
>> skip the Ethernet header and vlan comparison for skbs whose associated
>> netdevice type isn't ARPHRD_ETHER? e.g something along the lines of:
>>
>> > diff --git a/net/core/dev.c b/net/core/dev.c
>> > index 115dee1..c529f5a 100644
>> > --- a/net/core/dev.c
>> > +++ b/net/core/dev.c
>> > @@ -3505,9 +3505,11 @@ __napi_gro_receive(struct napi_struct *napi,
>> > struct sk_buff *skb)
>> > unsigned long diffs;
>> >
>> > diffs = (unsigned long)p->dev ^ (unsigned long)skb->dev;
>> > - diffs |= p->vlan_tci ^ skb->vlan_tci;
>> > - diffs |= compare_ether_header(skb_mac_header(p),
>> > - skb_gro_mac_header(skb));
>> > + if (!diffs && p->dev->type == ARPHRD_ETHER) {
>> > + diffs |= p->vlan_tci ^ skb->vlan_tci;
>> > + diffs |= compare_ether_header(skb_mac_header(p),
>> > +
>> > skb_gro_mac_header(skb));
>> > + }
>> > NAPI_GRO_CB(p)->same_flow = !diffs;
>> > NAPI_GRO_CB(p)->flush = 0;
>
> Hmm, do we really need to compare ether header, thats the question.
>
> IMHO, GRO could avoid this check, as legal trafic could be never merged
> (eg multipath)
This would allow injecting data to the connection by other host on the
same LAN. GRO does coalescing before any L3 anti-spoofing checks (eg.
rpfilter) are done, doesn't it?
Best Regards,
Michał Mirosław
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists