lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0Uf-mPK3xQuR+BdB5owfuJdQkZPE6Z0MQs9R-y3NRrC_9g@mail.gmail.com>
Date:	Mon, 28 Mar 2016 12:31:25 -0700
From:	Alexander Duyck <alexander.duyck@...il.com>
To:	Tom Herbert <tom@...bertland.com>
Cc:	Jesse Gross <jesse@...nel.org>, David Miller <davem@...emloft.net>,
	Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net v2 2/3] tunnels: Don't apply GRO to multiple layers of encapsulation.

On Mon, Mar 28, 2016 at 11:47 AM, Tom Herbert <tom@...bertland.com> wrote:
> On Mon, Mar 28, 2016 at 10:37 AM, Alexander Duyck
> <alexander.duyck@...il.com> wrote:
>> On Mon, Mar 28, 2016 at 9:31 AM, Tom Herbert <tom@...bertland.com> wrote:
>>> On Sun, Mar 27, 2016 at 9:38 PM, Jesse Gross <jesse@...nel.org> wrote:
>>>> On Sat, Mar 26, 2016 at 12:41 PM, Tom Herbert <tom@...bertland.com> wrote:
>>>>> On Sat, Mar 19, 2016 at 9:32 AM, Jesse Gross <jesse@...nel.org> wrote:
>>>>>> When drivers express support for TSO of encapsulated packets, they
>>>>>> only mean that they can do it for one layer of encapsulation.
>>>>>> Supporting additional levels would mean updating, at a minimum,
>>>>>> more IP length fields and they are unaware of this.
>>>>>>
>>>>> This patch completely breaks GRO for FOU and GUE.
>>>>>
>>>>>> No encapsulation device expresses support for handling offloaded
>>>>>> encapsulated packets, so we won't generate these types of frames
>>>>>> in the transmit path. However, GRO doesn't have a check for
>>>>>> multiple levels of encapsulation and will attempt to build them.
>>>>>>
>>>>>> UDP tunnel GRO actually does prevent this situation but it only
>>>>>> handles multiple UDP tunnels stacked on top of each other. This
>>>>>> generalizes that solution to prevent any kind of tunnel stacking
>>>>>> that would cause problems.
>>>>>>
>>>>> GUE and FOU regularly create packets that will be both GSO UDP tunnels
>>>>> and IPIP, SIT, GRE, etc. This is by design. There should be no
>>>>> ambiguity in the drivers as to what this means. For instance, if
>>>>> SKB_GSO_UDP_TUNNEL and SKB_GSO_GRE are set that is GRE/UDP packet, a
>>>>> driver can use ndo_features_check to validate.
>>>>>
>>>>> So multiple levels of encapsulation with GRO is perfectly valid and I
>>>>> would suggest to simply revert this patch. The one potential issue we
>>>>> could have is the potential for GRO to construct a packet which is a
>>>>> UDP-encapsulation inside another encapsulation, like UDP-encap in GRE.
>>>>> In this case the GSO flags don't provide enough information to
>>>>> distinguish say between GRE/UDP (common case) and UDP/GRE (uncommon
>>>>> case). To make this clear we can set udp_mark in GRE, ipip, and sit
>>>>> but *not* check for it.
>>>>
>>>> Generally speaking, multiple levels of encapsulation offload are not
>>>> valid. I think this is pretty clear from the fact that none of the
>>>> encapsulation drivers expose support for encapsulation offloads on
>>>> transmit and the drivers supporting NETIF_F_GSO_GRE and
>>>> NETIF_F_GSO_UDP_TUNNEL do not mean they can handle GRE in VXLAN.
>>>>
>>>
>>> Kernel software offload does support this combination just fine.
>>> Seriously, I've tested that more than a thousand times. This is a few
>>> HW implementations you're referring to. The limitations of these
>>> drivers should not dictate how we build the software-- it needs to
>>> work the other way around.
>>
>> Jesse does have a point.  Drivers aren't checking for this kind of
>> thing currently as the transmit side doesn't generate these kind of
>> frames.  The fact that GRO is makes things a bit more messy as we will
>> really need to restructure the GSO code in order to handle it.  As far
>> as drivers testing for it I am pretty certain the i40e isn't.  I would
>> wonder if we need to add yet another GSO bit to indicate that we
>> support multiple layers of encapsulation.  I'm pretty sure the only
>> way we could possibly handle it would be in software since what you
>> are indicating is a indeterminate number of headers that all require
>> updates.
>>
>>>> Asking drivers to assume that this combination of flags means FOU
>>>> doesn't seem right to me. To the best of my knowledge, no driver uses
>>>> ndo_feature_check to do validation of multiple tunnel offload flags
>>>> since the assumption is that the stack will never try to do this.
>>>> Since FOU is being treated as only a single level of encapsulation, I
>>>> think it would be better to just advertise it that way on transmit
>>>> (i.e. only set SKB_GSO_UDP_TUNNEL).
>>>>
>>> If it's not FOU it will be something else. Arbitrarily declaring
>>> multi-levels of encapsulation invalid is simply the wrong direction,
>>> we should be increasing generality and capabilities of the kernel not
>>> holding it down with artificial limits. This is why the generic TSO
>>> work that Alexander and Edward are looking at is so important-- if
>>> this flies then we can offload any combination of encapsulations with
>>> out protocol specific information.
>>
>> The segmentation code isn't designed to handle more than 2 layers of
>> headers.  Currently we have the pointers for the inner headers and the
>> outer headers.  If you are talking about adding yet another level we
>> would need additional pointers in the skbuff to handle them and we
>> currently don't have them at present.
>>
>>>> The change that you are suggesting would result in packets generated
>>>> by GRO that cannot be handled properly on transmit in some cases and
>>>> would likely end up being dropped or malformed. GRO is just an
>>>> optimization and correctness must come first so we cannot use it if it
>>>> might corrupt traffic.
>>>
>>> That's (a few) drivers problem. It's no different than if they had
>>> decided that SKB_GSO_UDP_TUNNEL means vxlan, or they can support GRE
>>> in IPv4 offload but not GRE in IPv6, or they can only handle headers
>>> of a certain size, can't handle IPv6 ext. hdrs., etc. As I mentioned,
>>> the long term solution is to eliminate the GSO_ flags and use a
>>> generic segmentation offload interface. But in the interim, it is
>>> *incumbent* on drivers to determine if they can handle a GSO packet
>>> and the interfaces to do that exist. Restricting the capabilities of
>>> GRO just to appease those drivers is not right. Breaking existing GRO
>>> for their benefit is definitely not right.
>>
>> This isn't about if drivers can handle it.  It is about if the skbuff
>> can handle it.  The problem as it stands right now is that we start
>> losing data once we go past 1 level of encapsulation.  We only have
>> the standard mac_header, network_header, transport_header, and then
>> the inner set of the same pointers.  If we try to go multiple levels
>> deep we start losing data.
>>
> Huh? GUE/FOU has been running perfectly well with two levels of
> encapsulation for over a year now. We never had to add anything to
> skbuff to make that work. If "losing data" is a problem please provide
> the *reproducible* test case for that and we'll debug that.

I'm guessing most of your examples involve either a remote checksum
being enabled or using NICs that don't support any tunnel offloads?
Hardware needs to be able to identify where headers are in order to
perform most of their offloads for TSO.  We either have to parse to
find them or we are provided with them by the stack.  GSO can work
around it as long as we don't stack checksum based offload with
non-checksum of the same type.

mostIf for example you were to try and send a frame that had either an
inner or outer GRE tunnel in addition to a UDP tunnel most NICs would
probably screw it up.  Up until now that hasn't been an issue.  As we
start turning on offload support for multiple tunnel types thanks to
the partial offloads it will come back and bite us if we try to tell
hardware to handle more than 2 levels of headers.  I'm thinking if
nothing else we might have to add yet another bit to GSO for stacked
tunnels which can probably only be supported via partial GSO and only
if we can get away with just replicating headers.

It looks like we need to go through and probably clean up both the GSO
and GRO code.  First we have to get the GSO code setup so that it can
fully handle multiple levels of tunnels.  The code as it is now
assumes you would only have one level and we are configuring the
headers as such.  In addition the GRO code doesn't seem to place the
header offsets correctly.  For instance, from what I can tell it looks
like the inner transport offset is never updated.  We will probably
need to have pointers for the inner-most and outer-most set of headers
and from there we can start handling the offloads.

- Alex

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ