lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UfsUfVm0YBOrZKnqdqKpRZnt5KSeDs_rJ7Z=7kwS0rh7w@mail.gmail.com>
Date:   Thu, 7 Dec 2017 18:36:54 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Michael Chan <michael.chan@...adcom.com>
Cc:     David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>,
        Andrew Gospodarek <andrew.gospodarek@...adcom.com>,
        Ariel Elior <Ariel.Elior@...ium.com>,
        everest-linux-l2@...ium.com
Subject: Re: [PATCH net-next v2 1/5] net: Introduce NETIF_F_GRO_HW.

On Thu, Dec 7, 2017 at 4:05 PM, Michael Chan <michael.chan@...adcom.com> wrote:
> On Thu, Dec 7, 2017 at 3:35 PM, Alexander Duyck
> <alexander.duyck@...il.com> wrote:
>> On Thu, Dec 7, 2017 at 3:17 PM, Michael Chan <michael.chan@...adcom.com> wrote:
>>> I don't get this.  I don't see how TSO is related.
>>
>> It isn't. That is the point. If I change ANY feature it will trigger
>> fix_features to get called. It will then see we have GRO_HW and LRO
>> since we have one interface that supports one and one that supports
>> another. It will trigger this code and cause LRO to be disabled which
>> then will be propagated down.
>
> I see.  But this won't happen.  Because the bonding driver is not
> advertising NETIF_F_GRO_HW in its hw_features.  It is not advertising
> NETIF_F_GRO either but I think it gets added automatically since it is
> a software feature.  So LRO won't get disabled on the bond when a
> unrelated feature is changed.
>
> But I think I see your point.  I can make it so that it is up to
> individual driver's .ndo_fix_features() to drop LRO/GRO_HW as it sees
> fit, instead of doing it in the common netdev_fix_features().  That
> way, it is more flexible at least.

Thank you.

>>
>>>> That
>>>> is a bad behavior and shouldn't happen. In addition I would want to be
>>>> able to disable GRO_HW from the bond should I encounter a buggy
>>>> implementation of the GRO hardware offload at some point in the
>>>> future. I don't like it being tied so closely with GRO since all it
>>>> takes is one hardware failure and then we are dealing with people
>>>> disabling GRO due to reports of it being buggy instead of it being
>>>> tied specifically to GRO_HW.
>>>>
>>>
>>> Today, GRO, RXCSUM, NTUPLE, etc on a bond do not get propagated.  You
>>> can propose and make them propagate if you want.
>>
>> Or you can fix your patches so you take it into account now instead of
>> putting things in a broken state and asking others to fix it.
>
> I don't think that things are necessarily broken today.  LRO truly
> needs to be propagated.  It's debatable whether other features like
> GRO/RXCSUM/NTUPLE should be centrally set by the upper device or not.

So I can agree with the NTUPLE not being propagated since it doesn't
actually effect upper devices. Really the functionality only really
has effects locally since the functionality consists of route to a
specific queue/device or drop the packet.

I'm not sure why RXCSUM isn't being propagated. It seems like that is
something that would make sense to have passed all the way down to the
lower devices since a single device that is doing bad Rx checksum
offloads could potentially corrupt all traffic in a bond. Seems like
that one should definitely be included.

>>
>>> So GRO_HW just naturally follows GRO since it is a true subset of it.
>>
>> Right, and I am saying you have exposed a bug. If I don't want GRO on
>> the bond it should be propagated down. It doesn't make sense to allow
>> lower devices to assemble frames when the bond doesn't want them and
>> GSO won't kick in before it gets to the bond.
>
> GRO kicks in at the lower device before it gets to the bond if the
> lower device calls napi_gro_receive() and GRO is enabled.

I get that. I assume the reason why the bond doesn't have it enabled
is because we don't want it to kick in at every given netdev, there
isn't any point to do GRO more than once. The problem is GRO_HW isn't
a pure software offload like GRO is. Call me a pessimist, but when we
end up encountering a buggy implementation that has to be disabled we
will want the right infrastructure in place to handle it. It becomes
another argument for why we might want to split GRO_HW and GRO without
tying them together. It would make sense to expose GRO_HW in a bond,
but not GRO. It might be something where we want to do any close tying
together of the GRO flag and GRO_HW at the driver as well. Basically
the legacy devices that transition over to GRO_HW from using just the
GRO flag could do that to maintain existing functionality, and new
drivers that implement it could opt in to the same behavior or just
handle GRO_HW as a separate flag.

Actually I just had a thought. What if we consider this a separate GRO
stage instead of just a hardware offload? Our standard GRO is a post
receive from the driver perspective, basically the packet is assembled
after we have handed it to the stack. What you are doing with GRO_HW
is essentially providing an early reassembly before it is handed to
the stack. What if we were to rename GRO_HW to something like
GRO_LOWER, GRO_EARLY, GRO_PRE, or pick your name (I'm lousy at
naming), and used it as a way to indicate that we want to perform GRO
before we begin receive processing on the frame in our driver? Then
for stacked devices you could use this new flag to indicate you don't
want to perform GRO on the lower levels below this device, and could
then use the regular GRO flag to control if we do it ourselves. Doing
that should provide stacked devices with a good way to control GRO on
the lower devices and would resolve what you need to indicate as well.
The only real changes needed might be a rename and to add the
necessary bit shifting for the upper and lower dev sync code. If you
aren't interested in the idea I can probably spend a couple of hours
getting to it tomorrow since I think this might be a much better way
to go as it solves multiple issues.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ