Message-ID: <87pl6h2a2l.fsf@nvidia.com>
Date: Fri, 6 Feb 2026 14:44:34 +0100
From: Petr Machata <petrm@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: <davem@...emloft.net>, <netdev@...r.kernel.org>, <edumazet@...gle.com>,
<pabeni@...hat.com>, <andrew+netdev@...n.ch>, <horms@...nel.org>,
<shuah@...nel.org>, <willemb@...gle.com>, <petrm@...dia.com>,
<donald.hunter@...il.com>, <michael.chan@...adcom.com>,
<pavan.chebbi@...adcom.com>, <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH net-next 3/9] tools: ynltool: add qstats analysis for
HW-GRO efficiency / savings
Jakub Kicinski <kuba@...nel.org> writes:
> Extend ynltool to compute a HW GRO savings metric - how many
> packets HW GRO has been able to save the kernel from seeing.
>
> Note that this definition does not actually take into account
> whether the segments were or weren't eligible for HW GRO.
> If a machine is receiving all-UDP traffic - new metric will show
> HW-GRO savings of 0%. Conversely since the super-packet still
> counts as a received packet, savings of 100% is not achievable.
> Perfect HW-GRO on a machine with 4k MTU and 64kB super-frames
> would show ~93.75% savings. With 1.5k MTU we may see up to
> ~97.8% savings (if my math is right).
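For what it's worth, the bound seems to check out. A minimal sketch, assuming every super-packet coalesces the same number of wire segments and counts as one packet seen by the kernel; max_savings_pct() is a made-up helper, not something in the patch:

```c
/* Hypothetical helper, not part of the patch: best-case savings when
 * each super-packet coalesces `segs` wire packets into a single packet
 * seen by the kernel. One packet out of every `segs` still reaches the
 * stack, so 100% is never reached. */
static inline double max_savings_pct(unsigned int segs)
{
	if (!segs)
		return 0.0;
	return 100.0 * (1.0 - 1.0 / segs);
}
```

With 16 segments (64kB / 4k) this gives 93.75%. With 45 segments, which assumes roughly 1448 bytes of payload per 1.5k MTU packet, it gives about 97.8%.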
>
> Example after 10 sec of iperf on a freshly booted machine
> with 1.5k MTU:
>
> $ ynltool qstats show
> eth0 rx-packets: 40681280 rx-bytes: 61575208437
> rx-alloc-fail: 0 rx-hw-gro-packets: 1225133
> rx-hw-gro-wire-packets: 40656633
> $ ynltool qstats hw-gro
> eth0: 96.9% savings
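The example numbers are consistent with one reading of the metric, sketched below. The assumptions are mine: rx-packets counts wire packets, and the kernel sees rx-packets minus the coalesced wire packets plus the resulting super-packets. hw_gro_savings_pct() is a hypothetical name, and the code in the patch may well compute this differently:

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch.
 * Assumed reading of the counters:
 *   rx_packets       - packets received on the wire
 *   gro_wire_packets - wire packets that were coalesced by HW GRO
 *   gro_packets      - super-packets produced by that coalescing
 * Packets the kernel did not have to see = gro_wire_packets - gro_packets. */
static inline double hw_gro_savings_pct(uint64_t rx_packets,
					uint64_t gro_packets,
					uint64_t gro_wire_packets)
{
	if (!rx_packets || gro_wire_packets < gro_packets)
		return 0.0;
	return 100.0 * (double)(gro_wire_packets - gro_packets) /
	       (double)rx_packets;
}
```

Feeding in the counters from the output above (40681280, 1225133, 40656633) gives a value between 96.9% and 97.0%, in line with the 96.9% shown.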
>
> None of the NICs I have access to can report "missed" HW-GRO
> opportunities so computing a true "effectiveness" metric
> is not possible. One could also argue that an effectiveness metric
> is inferior in environments where we control both senders and
> receivers: the savings metric will capture both regressions
> in the receiver's HW GRO effectiveness and regressions in senders
> sending smaller TSO trains. And we care about both. The main
> downside is that it's hard to tell at a glance how well the NIC
> is doing because the savings will be dependent on traffic patterns.
>
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> tools/net/ynl/ynltool/qstats.c | 75 +++++++++++++++++++++++++++++++---
> 1 file changed, 70 insertions(+), 5 deletions(-)
>
> diff --git a/tools/net/ynl/ynltool/qstats.c b/tools/net/ynl/ynltool/qstats.c
> index d19acab0bf2a..e5b83cf9bf3b 100644
> --- a/tools/net/ynl/ynltool/qstats.c
> +++ b/tools/net/ynl/ynltool/qstats.c
Since I see there's going to be a v2, a nit:
> @@ -580,6 +638,7 @@ static int do_help(int argc __attribute__((unused)),
> "Usage: %s qstats { COMMAND | help }\n"
> " %s qstats [ show ] [ OPTIONS ]\n"
> " %s qstats balance\n"
> + " %s qstats hw-gro\n"
> "\n"
> " OPTIONS := { scope queue | group-by { device | queue } }\n"
> "\n"
I think at this point it would make sense to convert to the positional
%1$s throughout, so that bin_name is passed once, instead of pumping in
more arguments.
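Roughly what I have in mind, as a sketch only. format_usage() is a made-up stand-in for do_help(), and the column alignment is approximate. Note that the %n$ form is a POSIX extension rather than ISO C, so this assumes a libc that supports it, which glibc does:

```c
#include <stdio.h>

/* Hypothetical stand-in for do_help(): every "%1$s" refers to the first
 * variadic argument, so bin_name is passed once no matter how many
 * usage lines get added later. */
static inline int format_usage(char *buf, size_t len, const char *bin_name)
{
	return snprintf(buf, len,
			"Usage: %1$s qstats { COMMAND | help }\n"
			"       %1$s qstats [ show ] [ OPTIONS ]\n"
			"       %1$s qstats balance\n"
			"       %1$s qstats hw-gro\n",
			bin_name);
}
```

Positional and non-positional conversions cannot be mixed in one format string, hence "throughout".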
> @@ -588,17 +647,23 @@ static int do_help(int argc __attribute__((unused)),
> " show scope queue - Display per-queue statistics\n"
> " show group-by device - Display device-aggregated statistics (default)\n"
> " show group-by queue - Display per-queue statistics\n"
> - " balance - Analyze traffic distribution balance.\n"
> + "\n"
> + " Analysis:\n"
> + " balance - Traffic distribution between queues.\n"
> + " hw-gro - HW GRO effectiveness analysis\n"
> + " - savings - delta between packets received\n"
> + " on the wire and packets seen by the kernel.\n"
> "",
> - bin_name, bin_name, bin_name);
> + bin_name, bin_name, bin_name, bin_name);
>
> return 0;
> }