Message-ID: <57bf675d-c2f0-4022-845c-166891e336be@gmail.com>
Date: Tue, 26 Mar 2024 18:25:02 +0100
From: Richard Gobert <richardbgobert@...il.com>
To: Paolo Abeni <pabeni@...hat.com>, Eric Dumazet <edumazet@...gle.com>
Cc: davem@...emloft.net, kuba@...nel.org, willemdebruijn.kernel@...il.com,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net-next v4 4/4] net: gro: move L3 flush checks to
tcp_gro_receive
Paolo Abeni wrote:
> Hi,
>
> On Tue, 2024-03-26 at 16:02 +0100, Richard Gobert wrote:
>> This patch is meaningful by itself - removing checks against non-relevant
>> packets and making the flush/flush_id checks in a single place.
>
> I'm personally not sure this patch is a win. The code churn is
> significant. I understand this is for performance's sake, but I don't
> see the benefit???
>
Could you clarify what you mean by code churn?
This patch removes all use of p->flush and flush_id from the CB. Currently
the logic for the L3 flush_id is scattered across tcp_gro_receive and
{inet,ipv6}_gro_receive, with conditionals rewriting ->flush, ->flush_id
and ->is_atomic. Moving it to one place (gro_network_flush) should be more
readable. (Personally, it took me a lot of time to understand the current
logic of flush + flush_id + is_atomic.)
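
To illustrate what "one place" means, here is a minimal user-space sketch.
It is not the kernel code and not the patch itself: struct toy_held,
toy_network_flush() and the rule it applies (TTL and DF must match; the IP
ID either stays fixed with DF set or advances by the number of segments
already held) are simplified assumptions made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow state; NOT the kernel's NAPI_GRO_CB layout. */
struct toy_held {
	uint16_t first_id; /* IP ID of the first segment held */
	uint16_t count;    /* number of segments merged so far */
	uint8_t  ttl;      /* TTL of the held packet */
	bool     df;       /* DF bit of the held packet */
	bool     fixed_id; /* flow was seen to use a constant IP ID */
};

/*
 * Toy rule, decided in a single function: returns true when the new
 * segment must not be merged into p. No other layer writes the result.
 */
static bool toy_network_flush(struct toy_held *p, uint16_t new_id,
			      bool new_df, uint8_t new_ttl)
{
	/* 16-bit difference, so ID wraparound is handled naturally. */
	uint16_t delta = (uint16_t)(new_id - p->first_id);

	assert(p->count >= 1);

	if (p->ttl != new_ttl || p->df != new_df)
		return true;

	/* Second segment with DF and an unchanged ID: treat as fixed-ID flow. */
	if (p->count == 1 && new_df && delta == 0)
		p->fixed_id = true;

	/* Fixed-ID flows expect delta 0, incrementing flows expect count. */
	return delta != (p->fixed_id ? 0 : p->count);
}
```

The caller gets a single yes/no answer for the one packet it cares about,
instead of several layers updating shared fields that a later layer has to
interpret.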
> The changelog shows that perf reports slightly lower figures for
> inet_gro_receive(). That is expected, as this patch moves code out of
> that function. What about inet_gro_flush()/tcp_gro_receive() where such
> code is moved?
>
Please consider the following 2 common scenarios:
1) Multiple packets in the GRO bucket. This is common (e.g. when running
super_netperf TCP_STREAM). Each layer executes a for loop over every packet
in the bucket. Specifically, L3 gro_receive loops over the bucket making
flush, flush_id and is_atomic checks. For most packets in the bucket these
checks are not relevant, and they may also dirty cache lines of
non-relevant packets p. Removing code from the for loop is significant in
this case.
2) UDP/TCP streams which do not coalesce in GRO. This is the common case
for regular UDP connections (e.g. when running netperf UDP_STREAM). In this
case GRO is just overhead, so removing any code from these layers is good
(shown in the first measurement of the commit message).
In the case of a single TCP connection, the number of checks should be
the same overall, so there should be no noticeable difference.
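
The difference between the two scenarios and the single-connection case can
be seen in a toy user-space model that only counts checks. Everything in it
is an assumption for illustration: struct toy_pkt, the two receive
functions and the "ID must advance by one" rule are hypothetical and far
simpler than the real GRO code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKET_MAX 8

/* Hypothetical stand-in for a packet held in a GRO bucket. */
struct toy_pkt {
	uint32_t l3_flow; /* stands in for the address match */
	uint16_t l4_flow; /* stands in for the port match */
	uint16_t ip_id;
};

static unsigned int toy_l3_checks; /* how many L3 flush checks ran */

static bool toy_l3_flush_check(const struct toy_pkt *p,
			       const struct toy_pkt *in)
{
	toy_l3_checks++;
	return (uint16_t)(in->ip_id - p->ip_id) != 1;
}

/*
 * Checks done in the L3 loop for every L3 match; L4 picks one result.
 * Returns 1 = flush, 0 = merge, -1 = no matching packet in the bucket.
 */
static int toy_receive_per_layer(const struct toy_pkt *bucket, size_t n,
				 const struct toy_pkt *in)
{
	bool same_l3[TOY_BUCKET_MAX] = { false };
	bool flush[TOY_BUCKET_MAX] = { false };
	size_t i;

	assert(n <= TOY_BUCKET_MAX);

	for (i = 0; i < n; i++) {
		same_l3[i] = bucket[i].l3_flow == in->l3_flow;
		if (same_l3[i])
			flush[i] = toy_l3_flush_check(&bucket[i], in);
	}
	for (i = 0; i < n; i++)
		if (same_l3[i] && bucket[i].l4_flow == in->l4_flow)
			return flush[i];
	return -1;
}

/* Checks deferred until L4 has found the single packet to merge with. */
static int toy_receive_deferred(const struct toy_pkt *bucket, size_t n,
				const struct toy_pkt *in)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (bucket[i].l3_flow == in->l3_flow &&
		    bucket[i].l4_flow == in->l4_flow)
			return toy_l3_flush_check(&bucket[i], in);
	return -1;
}
```

With four held packets that share an L3 flow but differ at L4, the
per-layer variant runs the check four times and the deferred variant once.
When nothing matches at L4 (the non-coalescing case), the deferred variant
runs no checks at all. With a single held packet both variants run one.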
> Additionally the reported deltas are within noise level according to my
> personal experience with similar tests.
>
I've repeatedly compared net-next against this patch, and the results were
stable each time. Is there any specific test you think would be helpful to
show the result?
Thanks