lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UdKcVb+Qpmbdv=TfFTdnO9PGDgcTXpXf+eJ_y9k4wz_CA@mail.gmail.com>
Date:   Sat, 9 Dec 2017 14:04:47 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Michael Chan <michael.chan@...adcom.com>
Cc:     David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>,
        Andrew Gospodarek <andrew.gospodarek@...adcom.com>,
        Ariel Elior <Ariel.Elior@...ium.com>,
        everest-linux-l2@...ium.com
Subject: Re: [PATCH net-next v3 1/5] net: Introduce NETIF_F_GRO_HW.

On Sat, Dec 9, 2017 at 1:31 PM, Michael Chan <michael.chan@...adcom.com> wrote:
> On Sat, Dec 9, 2017 at 10:50 AM, Alexander Duyck
> <alexander.duyck@...il.com> wrote:
>> On Fri, Dec 8, 2017 at 10:27 PM, Michael Chan <michael.chan@...adcom.com> wrote:
>>> Introduce NETIF_F_GRO_HW feature flag for NICs that support hardware
>>> GRO.  With this flag, we can now independently turn on or off hardware
>>> GRO when GRO is on.  Previously, drivers were using NETIF_F_GRO to
>>> control hardware GRO and so it cannot be independently turned on or
>>> off without affecting GRO.
>>>
>>> Hardware GRO (just like GRO) guarantees that packets can be re-segmented
>>> by TSO/GSO to reconstruct the original packet stream.  It is a subset of
>>> NETIF_F_GRO and depends on it, as well as NETIF_F_RXCSUM.
>>
>> So I would disagree with it being a subset of NETIF_F_GRO. If anything
>> it is an alternative to NETIF_F_GRO. It is performing GRO much earlier
>> at the device level in the case of hardware drivers. My concern is
>> this is probably going to end up applying to things other than just
>> hardware drivers though. For example what is to prevent this from
>> being applied to something like a virtio/tap interface? It seems like
>> this should be something that would be easy to implement in software.
>
> If you do it in software, it's called NETIF_F_GRO.  We already have
> it.  The whole point of the new flag is that if the device has
> software GRO enabled, and if the device supports GRO_HW, then we can
> do a subset of GRO in hardware (hopefully faster).

I can see what you are getting at. But GRO_HW with GRO stacked on top
of it won't necessarily be the fastest form of GRO. If you have a
GRO_HW implementation that is complete enough people may want to
disable Software GRO in order to avoid the extra overhead involved
with using it.

>> In addition as I said in my earlier comments I think we should
>> probably look at using this new feature bit to indicate that we allow
>> GRO to occur at or below this device as opposed to just above it as
>> currently occurs with conventional GRO.
>>
>>> Since NETIF_F_GRO is not propagated between upper and lower devices,
>>> NETIF_F_GRO_HW should follow suit since it is a subset of GRO.  In other
>>> words, a lower device can independent have GRO/GRO_HW enabled or disabled
>>> and no feature propagation is required.  This will preserve the current
>>> GRO behavior.
>>
>> I'm going to back off on my requirement for you to handle propagation
>> since after spending a couple hours working on it I did find it was
>> more complex then I originally thought it would be. With that said
>> however I would want to see this feature implemented in such a way
>> that we can deal with propagating the bits in the future if we need to
>> and that is what I am basing my comments on.
>
> Nothing stops anyone from propagating the flag.  Just add
> NETIF_F_GRO_HW to NETIF_F_UPPER_DISABLES and it will be propagated
> just like LRO.

Yes, but the problem then is it doesn't solve the secondary issue of
no way to propagate down the desire to disable GRO as well. That is
why I am thinking that the new bit could be used to indicate that we
want GRO to be supported either in the driver or below it instead of
only in "hardware". We are much better off with a generic solution and
that is why I think it might be better to use more of a pipeline or
staged type definition for this. Basically with GRO it occurs in the
GRO logic just after the driver hands off the packet, while this new
bit indicates that GRO happens somewhere before then. If we use that
definition for this then it becomes usable to deal with things like
the stacked devices problem where the stacked devices normally have
the GRO flag disabled since we don't want to run GRO multiple times,
but as a result the stacked devices have no way of saying they don't
want GRO. If we tweak the definition of this bit it solves that
problem since it would allow for us disabling GRO, GRO_HW, and LRO on
any devices below a given device.

>>> @@ -7424,6 +7424,18 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
>>>                 features &= ~dev->gso_partial_features;
>>>         }
>>>
>>> +       if (features & NETIF_F_GRO_HW) {
>>> +               /* Hardware GRO depends on GRO and RXCSUM. */
>>> +               if (!(features & NETIF_F_GRO)) {
>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no GRO feature.\n");
>>> +                       features &= ~NETIF_F_GRO_HW;
>>> +               }
>>
>> I still disagree with this bit. I think GRO is a pure software
>> offload, whereas GRO_HW can represent either a software offload of
>> some sort occurring in or before the driver, or in the hardware.
>> Basically the difference between the two as I view it is where the GRO
>> is occurring. I would like to keep that distinction and make use of
>> it. As I mentioned before in the case of bonding we currently have no
>> way to disable GRO on the lower devices partially because GRO is a
>> pure software feature and always happens at each device along the way.
>> The nice thing about this new bit is the assumption is that it is
>> pushing GRO to the lowest possible level and not triggering any side
>> effects like GRO currently does. I hope to use that logic with stacked
>> devices so that we could clear the bit and have it disable GRO,
>> GRO_HW, and LRO on all devices below the device that cleared it.
>>
>> I think this linking of GRO and GRO_HW is something that would be
>> better served by moving it into the driver if you are wanting to
>> maintain the behavior of how this was previously linked to GRO.
>
> If you insist, I can move this to the driver's ndo_fix_features().
> But I feel it is much better to enforce this dependency system wide.
> Once again, GRO_HW is hardware accelerated GRO and should depend on
> it.

The question I would have is why? Where is the dependency? I don't see
it. It is GRO in one spot and/or GRO in the other. The two don't
interract directly and I don't believe you can do software GRO on a
frame that has already been coalesced in hardware, and you take a
performance penalty for trying to offload in software what you would
have already been handled in hardware.

Also, when we start propagating this up to indicate it is active we
don't want to have the GRO dependency since it would just make things
more expensive since we only need to do GRO in software once.

>> It
>> also makes it so that it is much easier to compare the performance for
>> GRO_HW against just a pure software GRO since you could then enable
>> them independently. Software GRO can come at a cost, and leaving it
>> enabled when you want to do it all in hardware is just adding a
>> penalty of sorts since I know for many of my routing tests I normally
>> disable GRO as it has a significant per-packet cost for small packet
>> workloads.
>>
>>> +               if (!(features & NETIF_F_RXCSUM)) {
>>> +                       netdev_dbg(dev, "Dropping NETIF_F_GSO_HW since no RXCSUM feature.\n");
>>> +                       features &= ~NETIF_F_GRO_HW;
>>> +               }
>>
>> So I was thinking about this. For LRO it makes sense to disable it in
>> the case of RXCSUM being disabled since most implementations leave the
>> Rx checksum mangled. However for GRO I am not sure it makes complete
>> sense. For software GRO we perform checksum validation in either
>> tcp4_gro_receive or tcp6_gro_receive. Why should the hardware
>> implementation behave differently? When a GRO frame is assembled the
>> checksum is converted to CHECKSUM_PARTIAL anyway even if Rx checksum
>> validation is disabled for the driver.
>
> This is a logical feature dependency that Yuval Mintz suggested.  For
> GRO_HW to work, hardware must verify the checksum of a packet before
> the packet can be merged.
>
> So if the user does not want to do RXCSUM on this device for whatever
> reason, it logically means that he also doesn't want to do GRO_HW with
> implied RXCSUM performed on each packet that is merged.
>
> So I agree with Yuval that this dependency makes sense.

Okay then, so if we are going to go that route we may want to be
complete on this and just disable GRO_HW and LRO if RXCSUM is not
enabled. We might also want to add a comment indicating that we don't
support anything that might mangle a packet at the driver level if
RXCSUM is not enabled. Comments explaining all this would be a good
thing just to keep someone from grabbing GRO and lumping it in at some
point in the future.

I'm still working on trying to propagate the Rx checksum properly
since it should probably follow the same UPPER_DISABLES behavior as
LRO, but I will probably only have a few hours over the next week to
really work on any code and there end up being a number of stacked
drivers that have to be updated. I would be good with just flipping
this logic for now and if RXCSUM is not set, and GRO_HW (just noticed
the typo in your message) is set, then print your message and clear
the bit. I can probably come back later and add LRO once I get the
propagation bits worked out.

As far as patch 2 in the set it would probably be better to either
drop it and just accept it as an outstanding issue, or you could take
on the propagation problems with GRO_HW and RXCSUM since we really
need to get those solved in order for this functionality to fully
work.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ