[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CACKFLikTA-pA8ewo6y_psZeSyXkmmCDGJv5KKizUVPEaz_js8g@mail.gmail.com>
Date: Sun, 10 Dec 2017 22:39:33 -0800
From: Michael Chan <michael.chan@...adcom.com>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: David Miller <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>,
Andrew Gospodarek <andrew.gospodarek@...adcom.com>,
Ariel Elior <Ariel.Elior@...ium.com>,
everest-linux-l2@...ium.com
Subject: Re: [PATCH net-next v3 1/5] net: Introduce NETIF_F_GRO_HW.
On Sun, Dec 10, 2017 at 9:02 AM, Alexander Duyck
<alexander.duyck@...il.com> wrote:
> On Sat, Dec 9, 2017 at 10:40 PM, Michael Chan <michael.chan@...adcom.com> wrote:
>> It is possible that if you have incoming packets 1, 2, 3, 4, 5 for a
>> TCP connection, HW_GRO can aggregate packets 1, 2, 3, but cannot
>> aggregate packets 4 and 5 due to hardware resource limitation.
>> Software GRO aggregates 4 and 5. So it works well together.
>
> Right. But in the case where 1, 2, 3, 4, and 5 were not aggregated by
> hardware GRO because the frames could not be aggregated and then GRO
> burns cycles coming to the same conclusion you have waste. Same thing
> goes for if hardware GRO aggregates 1 through 5 and then SW GRO tries
> to see if it can do more.
>
> They are both doing the same thing, but what I see it as is two
> passes, not something where they are working together. The hardware
> GRO can rely on software GRO for a second pass, but it doesn't need
> to. The fact that it doesn't need to tells me that it isn't a hard
> requirement to have GRO in order to make use of software GRO.
I guess I look at this as feature propagation from net device to
hardware. To me, it makes a lot of sense.
We've been doing hardware GRO for a while and I never think of it as
replacement for software GRO. It's hardware accelerated GRO for a
subset of the connections that hardware can handle.
As for the additional GRO pass in software, I think it is quite
efficient. When hardware has aggregated a GRO frame, software GRO
will effectively "flush" it and never hold it for more aggregation.
After this patchset is done, I can look at the code and see if we can
further optimize the "2nd pass" code path when hardware has already
aggregated the packet.
Anyway, I will move the GRO_HW/GRO dependency to the
ndo_fix_features() of the 3 drivers, so we can move on and get these
patches accepted.
>> May be you meant put the RXCSUM check in the outer if statement so
>> that someone could add more inner checks? OK, I think that's what you
>> meant.
>
> Yes that is what I meant.
OK. Will change in v4.
>> We need patch #2 otherwise generic GRO won't work on these 3 drivers.
>> I don't think I fully understand your concerns about propagation. To
>> me propagation is just a usage model where an upper device will
>> control the common features of lower devices. It is more convenient
>> to have propagation, but requires upper devices to be aware of all
>> features that propagate (GRO, RXCSUM). Without propagation, it is
>> still fine.
>
> I'm not sure if it is. It depends on how much XDP depends on the
> frames being non-linear. As-is I am pretty sure this doesn't work for
> the stacked case anyway since GRO was still enabled for lower devices
> anyway. So you might look at just modifying patch 2 to not worry about
> the stacked devices case since I think that was already broken with
> GRO anyway.
OK. I don't think anyone will run generic XDP on an upper device anyway.
Powered by blists - more mailing lists