lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZjwtoH1K1o0F5k+N@ubuntu>
Date: Thu, 9 May 2024 01:57:52 +0000
From: Joe Damato <jdamato@...tly.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Tariq Toukan <ttoukan.linux@...il.com>,
	Zhu Yanjun <zyjzyj2000@...il.com>, linux-kernel@...r.kernel.org,
	netdev@...r.kernel.org, saeedm@...dia.com, gal@...dia.com,
	nalramli@...tly.com, "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Leon Romanovsky <leon@...nel.org>,
	"open list:MELLANOX MLX5 core VPI driver" <linux-rdma@...r.kernel.org>,
	Paolo Abeni <pabeni@...hat.com>, Tariq Toukan <tariqt@...dia.com>
Subject: Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats

On Wed, May 08, 2024 at 05:56:38PM -0700, Jakub Kicinski wrote:
> On Wed, 8 May 2024 16:24:08 -0700 Joe Damato wrote:
> > > A possible reason for this difference is the queues included in the sum.
> > > Our stats are persistent across configuration changes, so they doesn't reset
> > > when number of channels changes for example.
> > > 
> > > We keep stats entries for al ring indices that ever existed. Our driver
> > > loops and sums up the stats for all of them, while the stack loops only up
> > > to the current netdev->real_num_rx_queues.
> > > 
> > > Can this explain the diff here?  
> > 
> > Yes, that was it. Sorry I didn't realize this case. My lab machine runs a
> > script to adjust the queue count shortly after booting.
> > 
> > I disabled that and re-ran:
> > 
> >   NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> > 
> > and all tests pass.
> 
> Stating the obvious, perhaps, but in this case we should add the stats
> from inactive queues to the base (which when the NIC is down means all
> queues).

If I'm following that right and understanding mlx5 (two things I am
unlikely to do simultaneously), that sounds to me like:

- mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
  priv->channels.params.num_channels (instead of priv->stats_nch), and when
  summing mlx5e_sq_stats in the latter function, it's up to
  priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.

- mlx5e_get_base_stats accumulates and outputs stats for everything from
  priv->channels.params.num_channels to priv->stats_nch, and
  priv->channels.params.mqprio.num_tc to priv->max_opened_tc... which
  should cover the inactive queues, I think.

Just writing that all out to avoid hacking up the wrong thing for the v2
and to reduce overall noise on the list :)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ