Message-ID: <CALx6S36wJdFnq=2ZkTfijqro1TM1GTqap=aLQ0+NszgWC6Sznw@mail.gmail.com>
Date: Fri, 11 Mar 2016 14:55:33 -0800
From: Tom Herbert <tom@...bertland.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: Edward Cree <ecree@...arflare.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Generic TSO (was Re: [net-next PATCH 0/2] GENEVE/VXLAN: Enable
outer Tx checksum by default)

On Fri, Mar 11, 2016 at 2:31 PM, Alexander Duyck
<alexander.duyck@...il.com> wrote:
> On Fri, Mar 11, 2016 at 1:29 PM, Edward Cree <ecree@...arflare.com> wrote:
>> On 11/03/16 21:09, Alexander Duyck wrote:
>>> The only real issue with the "generic" TSO is that it isn't going to
>>> be so generic. We have different devices that will support different
>>> levels of stuff. For example the ixgbe drivers will need to treat the
>>> outer tunnel header as one giant L2 header. As a result we will need
>>> to populate all the fields in the outer header including the outer IP
>>> ID, checksum, udp->len, and UDP or GRE checksum if requested. For
>>> i40e I think this gets a bit simpler as they already handle the outer
>>> IPv4 ID and checksum. I think there we may need to only populate the
>>> checksum for it to work out correctly. As such I may look at coming
>>> up with a number of functions so that we can mix and match based on
>>> what is needed in order to assemble a partially segmented frame.
>> AIUI, the point of the design is that we _can_ populate everything,
>> because we're keeping lengths and outer IP ID fixed, so outer checksums
>> stay the same and the outer tunnel header _is_ just one giant L2 header
>> which is bit-for-bit identical for each generated segment. So every
>> device gets to just be dumb and treat it as opaque.
>
> This works so long as the device isn't trying to do anything like
> insert VLAN tags. Then I think we might have an issue since we don't
> want to confuse the device and have it trying to insert the tag on the
> inner frame's Ethernet header.
>
In Edward's giant L2 header mode, couldn't VLAN tags just be part of that?

> I suspect we may have differing levels of "dumb" that we have to deal
> with. That is all I am saying. By default we could just populate all
> of the length and checksum fields in the outer header, we would just
> have to be consistent about what is provided then. In addition there
> will be the matter of sorting out the IP ID bits. For example some of
> the i40e parts support tunnel offloads, but not tunnel offloads with
> checksums enabled. I suspect those parts will end up wanting to
> handle the outer IP header and UDP length values. As a result,
> trying to do a "dumb" send there may result in us really messing up
> the IP ID values if we don't take steps to make it a bit smarter.
>
>>> The other issue I am working on at the moment to enable all this is to
>>> fix the differences between csum_tcpudp_magic and csum_ipv6_magic in
>>> terms of handling packet lengths greater than 65535. Currently we are
>>> messing up the checksum in relation to IPv6 since we are using the
>>> truncated uh->len value. I'll be submitting some patches later today
>>> that will hopefully get that fixed and that in turn should make the
>>> rest of the segmentation work easier.
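
As an illustration of the length problem Alexander describes (not code from this thread, and not the kernel's implementation), the sketch below sums an IPv6 pseudo-header once with the full 32-bit upper-layer length and once with the length truncated to 16 bits, as a `uh->len`-sized field would hold it. The helper names `fold16`, `sum16` and `ipv6_pseudo_sum` are hypothetical; the pseudo-header layout is assumed to follow RFC 8200.

```python
import ipaddress
import struct


def fold16(total: int) -> int:
    """Fold a wide ones'-complement sum down to 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def sum16(data: bytes) -> int:
    """Sum big-endian 16-bit words; the input length must be even."""
    return sum(struct.unpack(f"!{len(data) // 2}H", data))


def ipv6_pseudo_sum(src: str, dst: str, length: int, next_header: int) -> int:
    """Folded sum of an IPv6 pseudo-header (assumed RFC 8200 layout)."""
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I", length)       # 32-bit upper-layer length
        + struct.pack("!I", next_header)  # three zero bytes, then next header
    )
    return fold16(sum16(pseudo))


# Hypothetical addresses and an oversized length, for illustration only.
SRC, DST, UDP = "2001:db8::1", "2001:db8::2", 17
full = ipv6_pseudo_sum(SRC, DST, 70000, UDP)
truncated = ipv6_pseudo_sum(SRC, DST, 70000 & 0xFFFF, UDP)
```

With a length of 70000 the upper 16 bits of the length field contribute one extra unit to the sum, so `full` and `truncated` disagree, while any length that fits in 16 bits gives the same result either way.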
>> Again, in the superpacket we want to calculate the checksum based on the
>> subsegment length, rather than the length of the superpacket. The idea
>> is to create the header for an MSS-sized segment, then follow it with an
>> inner IP & TCP header, and n*MSS bytes of payload. (This of course
>> produces a superpacket that's not what you'd send over a link with a 64k
>> MTU, unlike how we do it today.)
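
The length arithmetic Edward describes can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: it assumes VXLAN over IPv4 with no IP or TCP options and no VLAN tags, and the function names `outer_lengths` and `superpacket_len` are hypothetical.

```python
# Assumed header sizes: no IP/TCP options, no VLAN tags.
ETH_HLEN = 14
IPV4_HLEN = 20
UDP_HLEN = 8
VXLAN_HLEN = 8
TCP_HLEN = 20


def outer_lengths(mss: int) -> dict:
    """Outer length fields as they would read for ONE MSS-sized segment."""
    inner_frame = ETH_HLEN + IPV4_HLEN + TCP_HLEN + mss
    udp_len = UDP_HLEN + VXLAN_HLEN + inner_frame
    return {"udp_len": udp_len, "ip_total_len": IPV4_HLEN + udp_len}


def superpacket_len(mss: int, n: int) -> int:
    """Bytes actually handed down: one header stack plus n*MSS of payload."""
    outer = ETH_HLEN + IPV4_HLEN + UDP_HLEN + VXLAN_HLEN
    inner = ETH_HLEN + IPV4_HLEN + TCP_HLEN
    return outer + inner + n * mss


lengths = outer_lengths(1400)      # fields describe a single segment
total = superpacket_len(1400, 10)  # buffer is much larger than the fields say
```

The outer length fields stay constant no matter how many segments the superpacket carries, which is why the outer checksums can be filled in once.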
>
> The question is at what point do we do the chopping. Should we be
> doing this in the drivers or somewhere higher in the stack like we do
> for standard GSO segmentation. I would think we would need to add
> another bit that says we can do GSO with custom outer headers since I
> can see VLANs being a possible issue otherwise.
>
>> Then hw just chops up the payload, fixes up the inner headers, and glues
>> the "L2" header on each packet.
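
A minimal software sketch of the "chop, fix up, glue" step Edward outlines is shown below. It is illustrative only and not taken from the thread or any driver: the function name `segment` and the `seq_off` parameter are hypothetical, and the only inner fix-up shown is the TCP sequence number. Real hardware would also have to handle the inner IP ID, the inner checksums and the TCP flags.

```python
import struct
from typing import List


def segment(outer: bytes, inner: bytes, seq_off: int,
            payload: bytes, mss: int) -> List[bytes]:
    """Split payload into MSS-sized frames behind an opaque outer header.

    outer   -- outer tunnel headers, copied unchanged onto every frame
    inner   -- inner header template (Ethernet + IP + TCP)
    seq_off -- byte offset of the 32-bit TCP sequence number within inner
    """
    (seq0,) = struct.unpack_from("!I", inner, seq_off)
    frames = []
    for off in range(0, len(payload), mss):
        hdr = bytearray(inner)
        # Advance the sequence number by the payload bytes already sent.
        struct.pack_into("!I", hdr, seq_off, (seq0 + off) & 0xFFFFFFFF)
        frames.append(outer + bytes(hdr) + payload[off:off + mss])
    return frames
```

Because `outer` is never inspected or modified, this only works if its length and checksum fields were already written for a single MSS-sized segment.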
>
> Yeah, it sounds really straightforward and easy. It isn't until you
> start digging into the actual code that it gets a bit hairy.
>
> What this effectively is, is another form of TSO where each driver will
> want to do things a little differently. A lot of it has to do with the
> fact that this is kind of a nasty hack that we are trying to add since
> many devices won't like the fact that we are lying about the size of
> our actual L2 header so things like VLAN tag insertion are going to
> end up blowing back on us.
>
Right, the point is that we're trying to get out of the model where
every driver/device implements TSO differently, supports ad hoc
protocols, etc. Do you see any other common invasive technique that we
need to deal with other than VLAN insertion and IP ID?

> Really my preference in the case of ixgbe would have been to let the
> hardware update the outer IP header and the inner TCP header and then
> do the UDP and inner IP header as the static headers. That way we
> could still theoretically support fragmentation on the outer headers
> which last I knew is a very real possibility since the DF bit is not
> set on the outer headers for VXLAN I believe.
>
> - Alex