Message-ID: <Zj1Y14MeqWWGFwP_@LQ3V64L9R2>
Date: Thu, 9 May 2024 16:14:31 -0700
From: Joe Damato <jdamato@...tly.com>
To: Tariq Toukan <ttoukan.linux@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Zhu Yanjun <zyjzyj2000@...il.com>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	saeedm@...dia.com, gal@...dia.com, nalramli@...tly.com,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Leon Romanovsky <leon@...nel.org>,
	"open list:MELLANOX MLX5 core VPI driver" <linux-rdma@...r.kernel.org>,
	Paolo Abeni <pabeni@...hat.com>, Tariq Toukan <tariqt@...dia.com>
Subject: Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats

On Thu, May 09, 2024 at 01:16:15PM +0300, Tariq Toukan wrote:
> 
> 
> On 09/05/2024 9:30, Joe Damato wrote:
> > On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
> > > On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
> > > > If I'm following that right and understanding mlx5 (two things I am
> > > > unlikely to do simultaneously), that sounds to me like:
> > > > 
> > > > - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
> > > >    priv->channels.params.num_channels (instead of priv->stats_nch),
> > > 
> > > Yes, tho, not sure whether the "if i < ...num_channels" is even
> > > necessary, as core already checks against real_num_rx_queues.
> > > 
> > > >    and when
> > > >    summing mlx5e_sq_stats in the latter function, it's up to
> > > >    priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
> > > > 
> > > > - mlx5e_get_base_stats accumulates and outputs stats for everything from
> > > >    priv->channels.params.num_channels to priv->stats_nch, and
> > > 
> > > I'm not sure num_channels gets set to 0 when device is down so possibly
> > > from "0 if down else ...num_channels" to stats_nch.
> > 
> > Yea, you were right:
> > 
> >    if (priv->channels.num == 0)
> >            i = 0;
> >    else
> >            i = priv->channels.params.num_channels;
> >    for (; i < priv->stats_nch; i++) {
> > 
> > Seems to be working now when I adjust the queue count and the test is
> > passing as I adjust the queue count up or down. Cool.
> > 
> 
> I agree that get_base should include all inactive queues stats.
> But it's not straight forward to implement.
> 
> A few guiding points:

Thanks for the guiding points; they are very helpful.

> Use mlx5e_get_dcb_num_tc(params) for current num_tc.
> 
> txq_ix (within the real_num_tx_queues) is calculated by c->ix + tc *
> params->num_channels.
> 
> The txqsq stats struct is chosen by channel_stats[c->ix]->sq[tc].
> 
> It means, in the base stats you should include SQ stats for:
> 1. all SQs of non-active channels, i.e. ch in [params.num_channels,
> priv->stats_nch), tc in [0, priv->max_opened_tc).
> 2. all SQs of non-active TCs in active channels [0, params.num_channels), tc
> in [mlx5e_get_dcb_num_tc(params), priv->max_opened_tc).

Thanks, yea, this is what I was working on last night -- I realized that I
need to include the non-active TCs on the active channels, too, and have
some code that does that.

I'm still off slightly, but am giving it another look now.
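
For reference, here is a rough sketch of how I read the two ranges above.
It is a toy model, not driver code: struct toy_priv, its fields, the
TOY_MAX_* limits and toy_base_tx_packets() are all made up for
illustration. num_tc stands in for mlx5e_get_dcb_num_tc(params), and
channels_open stands in for the priv->channels.num != 0 check from
earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_CH 8
#define TOY_MAX_TC 8

/* Hypothetical stand-in for the driver's private state (not mlx5 code). */
struct toy_priv {
	unsigned int num_channels;  /* models params.num_channels */
	unsigned int num_tc;        /* models mlx5e_get_dcb_num_tc(params) */
	unsigned int stats_nch;     /* channels that ever had stats allocated */
	unsigned int max_opened_tc; /* TCs that were ever opened */
	int channels_open;          /* models priv->channels.num != 0 */
	uint64_t sq_packets[TOY_MAX_CH][TOY_MAX_TC]; /* one counter per SQ */
};

/* Sum the TX packet counters that belong in the base stats. */
static uint64_t toy_base_tx_packets(const struct toy_priv *p)
{
	/* When the device is down, no channel counts as active. */
	unsigned int active = p->channels_open ? p->num_channels : 0;
	uint64_t sum = 0;
	unsigned int ch, tc;

	assert(p->stats_nch <= TOY_MAX_CH && p->max_opened_tc <= TOY_MAX_TC);

	/* Range 2: non-active TCs on active channels. */
	for (ch = 0; ch < active; ch++)
		for (tc = p->num_tc; tc < p->max_opened_tc; tc++)
			sum += p->sq_packets[ch][tc];

	/* Range 1: every SQ of the non-active channels. */
	for (ch = active; ch < p->stats_nch; ch++)
		for (tc = 0; tc < p->max_opened_tc; tc++)
			sum += p->sq_packets[ch][tc];

	return sum;
}
```

With 2 active channels out of 4, 1 active TC out of 2, and every counter
set to 1, this sums 2 counters from the first loop and 4 from the second.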

> Now I actually see that the patch has issues in mlx5e_get_queue_stats_tx.
> You should not loop over all TCs of channel index i.
> You must do a reverse mapping from "i" to the pair/tuple [ch_ix, tc], and
> then access a single TXQ stats by priv->channel_stats[ch_ix].sq[tc].

OK, thanks for explaining that. I'll take a closer look at this as well.
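
If I'm reading the txq_ix formula above correctly, the reverse mapping
should just be a modulo and a divide. A minimal standalone sketch, assuming
the queue index i is laid out exactly as c->ix + tc * params->num_channels;
struct txq_pos and both helper names are hypothetical, not existing mlx5
symbols:

```c
#include <assert.h>

/* Hypothetical helper type, not an mlx5 struct. */
struct txq_pos {
	unsigned int ch_ix; /* channel index, i.e. c->ix */
	unsigned int tc;    /* traffic class */
};

/* Forward mapping as described: txq_ix = c->ix + tc * num_channels. */
static unsigned int txq_ix_from_pos(unsigned int ch_ix, unsigned int tc,
				    unsigned int num_channels)
{
	return ch_ix + tc * num_channels;
}

/* Reverse mapping: recover [ch_ix, tc] from TX queue index i. */
static struct txq_pos txq_pos_from_ix(unsigned int i,
				      unsigned int num_channels)
{
	struct txq_pos pos;

	assert(num_channels > 0);
	pos.ch_ix = i % num_channels;
	pos.tc = i / num_channels;
	return pos;
}
```

The per-queue callback would then read a single entry,
priv->channel_stats[ch_ix].sq[tc], instead of looping over every TC of
channel i.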

> > Adding TCs to the NIC triggers the test to fail, so there's still some bug
> > in how I'm accumulating stats from the hw TCs.
