Message-ID: <c3f4f1a4-303d-4d57-ae83-ed52e5a08f69@linux.dev>
Date: Fri, 3 May 2024 12:55:41 +0200
From: Zhu Yanjun <zyjzyj2000@...il.com>
To: Joe Damato <jdamato@...tly.com>, linux-kernel@...r.kernel.org,
 netdev@...r.kernel.org, tariqt@...dia.com, saeedm@...dia.com
Cc: gal@...dia.com, nalramli@...tly.com, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Leon Romanovsky <leon@...nel.org>,
 "open list:MELLANOX MLX5 core VPI driver" <linux-rdma@...r.kernel.org>,
 Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats

On 03.05.24 04:25, Joe Damato wrote:
> Hi:
> 
> This is only 1 patch, so I know a cover letter isn't necessary, but it
> seems there are a few things to mention.
> 
> This change adds support for the per queue netdev-genl API to mlx5,
> which seems to output stats:
> 
> ./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
>           --dump qstats-get --json '{"scope": "queue"}'
> 
> ...snip
>   {'ifindex': 7,
>    'queue-id': 28,
>    'queue-type': 'tx',
>    'tx-bytes': 399462,
>    'tx-packets': 3311},
> ...snip

"ethtool -S ethX" already reports this information per queue:

..
      tx-0.packets: 2094
      tx-0.bytes: 294141
      rx-0.packets: 2200
      rx-0.bytes: 267673
..

> 
> I've tried to use the tooling suggested to verify that the per queue
> stats match the rtnl stats by doing this:
> 
>    NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> 
> And the tool outputs that there is a failure:
> 
>    # Exception| Exception: Qstats are lower, fetched later
>    not ok 3 stats.pkt_byte_sum

With ethtool, does the above problem still occur?
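
One way to check this would be to sum the per-queue counters from the
ethtool output and compare the totals with the rtnl stats. A minimal
sketch, assuming the counters use the "<dir>-<queue>.<name>: <value>"
layout of the sample above (counter names vary between drivers, so the
pattern will likely need adjusting); sum_queue_stats and SAMPLE are
hypothetical names used only for illustration:

```python
import re
from collections import defaultdict

# Assumed counter layout, taken from the sample output in this mail:
#   "<rx|tx>-<queue index>.<packets|bytes>: <value>"
# Other drivers name their per-queue counters differently.
QUEUE_STAT = re.compile(r"^\s*(rx|tx)-(\d+)\.(packets|bytes):\s*(\d+)\s*$")


def sum_queue_stats(ethtool_output):
    """Sum per-queue packet and byte counters across all queues."""
    totals = defaultdict(int)
    for line in ethtool_output.splitlines():
        match = QUEUE_STAT.match(line)
        if match:
            direction, _queue, name, value = match.groups()
            totals[f"{direction}-{name}"] += int(value)
    return dict(totals)


# The counters quoted earlier in this mail, used as sample input.
SAMPLE = """\
     tx-0.packets: 2094
     tx-0.bytes: 294141
     rx-0.packets: 2200
     rx-0.bytes: 267673
"""

if __name__ == "__main__":
    print(sum_queue_stats(SAMPLE))
```

On a live system the input could come from the output of
"ethtool -S <ifname>" instead of SAMPLE.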

Zhu Yanjun

> 
> The other tests all pass (including stats.qstat_by_ifindex).
> 
> This appears to mean that the netdev-genl queue stats have lower numbers
> than the rtnl stats even though the rtnl stats are fetched first. I
> added some debugging and found that both rx and tx bytes and packets are
> slightly lower.
> 
> The only explanations I can think of for this are:
> 
> 1. tx_ptp_opened and rx_ptp_opened are both true, in which case
>     mlx5e_fold_sw_stats64 adds bytes and packets to the rtnl struct and
>     might account for the difference. I skip this case in my
>     implementation, so that could certainly explain it.
> 2. Maybe I'm just misunderstanding how stats aggregation works in mlx5,
>     and that's why the numbers are slightly off?
> 
> It appears that the driver uses a workqueue to queue stats updates which
> happen periodically.
> 
>   0. the driver occasionally calls queue_work on the update_stats_work
>      workqueue.
>   1. This eventually calls MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw),
>      in drivers/net/ethernet/mellanox/mlx5/core/en_stats.c, which appears
>      to begin by first memsetting the internal stats struct where stats are
>      aggregated to zero. This would mean, I think, the get_base_stats
>      netdev-genl API implementation that I have is correct: simply set
>      everything to 0.... otherwise we'd end up double counting in the
>      netdev-genl RX and TX handlers.
>   2. Next, each of the stats helpers are called to collect stats into the
>      freshly 0'd internal struct (for example:
>      mlx5e_stats_grp_sw_update_stats_rq_stats).
> 
> That seems to be how stats are aggregated, which would suggest that if I
> simply .... do what I'm doing in this change the numbers should line up.
> 
> But they don't and it's either because of PTP or because I am
> misunderstanding/doing something wrong.
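
To illustrate the first explanation, here is a toy model of the
zero-then-fold aggregation described above. It is only a sketch, not the
mlx5 code: QueueStats, fold_totals and all the counter values are
hypothetical, chosen to show that a sum which skips the PTP queues comes
out lower than a sum which includes them.

```python
from dataclasses import dataclass


@dataclass
class QueueStats:
    packets: int = 0
    bytes: int = 0


def fold_totals(regular, ptp, include_ptp):
    """Start from a zeroed aggregate, then fold each queue's counters in."""
    total = QueueStats()  # stands in for zeroing the aggregate struct
    queues = list(regular) + (list(ptp) if include_ptp else [])
    for queue in queues:
        total.packets += queue.packets
        total.bytes += queue.bytes
    return total


# Illustrative values only; these are not measurements.
regular = [QueueStats(100, 15000), QueueStats(50, 7000)]
ptp = [QueueStats(3, 300)]

rtnl_like = fold_totals(regular, ptp, include_ptp=True)     # PTP folded in
qstats_like = fold_totals(regular, ptp, include_ptp=False)  # PTP skipped

if __name__ == "__main__":
    print("rtnl-like:  ", rtnl_like)
    print("qstats-like:", qstats_like)
```

In this model the gap between the two totals is exactly the traffic
counted on the PTP queues, which would match totals that are only
slightly lower.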
> 
> Maybe the MLNX folks can suggest a hint?
> 
> Thanks,
> Joe
> 
> Joe Damato (1):
>    net/mlx5e: Add per queue netdev-genl stats
> 
>   .../net/ethernet/mellanox/mlx5/core/en_main.c | 68 +++++++++++++++++++
>   1 file changed, 68 insertions(+)
> 