Message-ID: <ZjwtoH1K1o0F5k+N@ubuntu>
Date: Thu, 9 May 2024 01:57:52 +0000
From: Joe Damato <jdamato@...tly.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Tariq Toukan <ttoukan.linux@...il.com>,
Zhu Yanjun <zyjzyj2000@...il.com>, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, saeedm@...dia.com, gal@...dia.com,
nalramli@...tly.com, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Leon Romanovsky <leon@...nel.org>,
"open list:MELLANOX MLX5 core VPI driver" <linux-rdma@...r.kernel.org>,
Paolo Abeni <pabeni@...hat.com>, Tariq Toukan <tariqt@...dia.com>
Subject: Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
On Wed, May 08, 2024 at 05:56:38PM -0700, Jakub Kicinski wrote:
> On Wed, 8 May 2024 16:24:08 -0700 Joe Damato wrote:
> > > A possible reason for this difference is the queues included in the sum.
> > > Our stats are persistent across configuration changes, so they don't reset
> > > when the number of channels changes, for example.
> > >
> > > We keep stats entries for all ring indices that ever existed. Our driver
> > > loops and sums up the stats for all of them, while the stack loops only up
> > > to the current netdev->real_num_rx_queues.
> > >
> > > Can this explain the diff here?
> >
> > Yes, that was it. Sorry I didn't realize this case. My lab machine runs a
> > script to adjust the queue count shortly after booting.
> >
> > I disabled that and re-ran:
> >
> > NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
> >
> > and all tests pass.
>
> Stating the obvious, perhaps, but in this case we should add the stats
> from inactive queues to the base (which when the NIC is down means all
> queues).

If I'm following that right and understanding mlx5 (two things I am
unlikely to do simultaneously), that sounds to me like:

  - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
    priv->channels.params.num_channels (instead of priv->stats_nch), and when
    summing mlx5e_sq_stats in the latter function, it's up to
    priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.

  - mlx5e_get_base_stats accumulates and outputs stats for everything from
    priv->channels.params.num_channels to priv->stats_nch, and
    priv->channels.params.mqprio.num_tc to priv->max_opened_tc... which
    should cover the inactive queues, I think.

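To make the proposed split concrete, here is a rough userspace model of it.
This is not driver code: struct model_priv, the model_* functions, and the
MODEL_MAX_* bounds are all made up for illustration, and the fields only
mirror the priv members named above. It also assumes that an inactive
channel contributes all of its TCs to the base, which is one reading of the
second bullet.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_CH 8 /* hypothetical bounds, only for this model */
#define MODEL_MAX_TC 4

/* Hypothetical stand-in for the relevant parts of the driver's priv. */
struct model_priv {
	unsigned int num_channels;  /* models priv->channels.params.num_channels */
	unsigned int stats_nch;     /* models priv->stats_nch */
	unsigned int num_tc;        /* models priv->channels.params.mqprio.num_tc */
	unsigned int max_opened_tc; /* models priv->max_opened_tc */
	uint64_t rx_packets[MODEL_MAX_CH];
	uint64_t tx_packets[MODEL_MAX_CH][MODEL_MAX_TC];
};

/* Per-queue RX: only currently active channels report anything. */
static uint64_t model_queue_rx(const struct model_priv *p, unsigned int i)
{
	assert(p->num_channels <= p->stats_nch && p->stats_nch <= MODEL_MAX_CH);
	return i < p->num_channels ? p->rx_packets[i] : 0;
}

/* Per-queue TX: sum only the currently configured TCs of an active channel. */
static uint64_t model_queue_tx(const struct model_priv *p, unsigned int i)
{
	uint64_t sum = 0;
	unsigned int tc;

	if (i >= p->num_channels)
		return 0;
	for (tc = 0; tc < p->num_tc; tc++)
		sum += p->tx_packets[i][tc];
	return sum;
}

/* Base RX: channels that existed at some point but are not active now. */
static uint64_t model_base_rx(const struct model_priv *p)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = p->num_channels; i < p->stats_nch; i++)
		sum += p->rx_packets[i];
	return sum;
}

/*
 * Base TX: for active channels, only the TCs past num_tc; for inactive
 * channels, every TC up to max_opened_tc (assumption, see above).
 */
static uint64_t model_base_tx(const struct model_priv *p)
{
	uint64_t sum = 0;
	unsigned int i, tc;

	for (i = 0; i < p->stats_nch; i++) {
		unsigned int first = i < p->num_channels ? p->num_tc : 0;

		for (tc = first; tc < p->max_opened_tc; tc++)
			sum += p->tx_packets[i][tc];
	}
	return sum;
}
```

The property this is meant to preserve is that base plus the per-queue
values adds up to the total over every ring index that ever existed, so
nothing disappears when the channel count shrinks. In this model, an
interface that is down would correspond to treating the active channel
count as zero, so everything lands in the base.
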
Just writing that all out to avoid hacking up the wrong thing for the v2
and to reduce overall noise on the list :)