Message-ID: <32495a72-4d41-4b72-84e7-0d86badfd316@gmail.com>
Date: Thu, 9 May 2024 13:16:15 +0300
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Joe Damato <jdamato@...tly.com>, Jakub Kicinski <kuba@...nel.org>
Cc: Tariq Toukan <ttoukan.linux@...il.com>, Zhu Yanjun
<zyjzyj2000@...il.com>, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, saeedm@...dia.com, gal@...dia.com,
nalramli@...tly.com, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Leon Romanovsky <leon@...nel.org>,
"open list:MELLANOX MLX5 core VPI driver" <linux-rdma@...r.kernel.org>,
Paolo Abeni <pabeni@...hat.com>, Tariq Toukan <tariqt@...dia.com>
Subject: Re: [PATCH net-next 0/1] mlx5: Add netdev-genl queue stats
On 09/05/2024 9:30, Joe Damato wrote:
> On Wed, May 08, 2024 at 07:08:39PM -0700, Jakub Kicinski wrote:
>> On Thu, 9 May 2024 01:57:52 +0000 Joe Damato wrote:
>>> If I'm following that right and understanding mlx5 (two things I am
>>> unlikely to do simultaneously), that sounds to me like:
>>>
>>> - mlx5e_get_queue_stats_rx and mlx5e_get_queue_stats_tx check if i <
>>> priv->channels.params.num_channels (instead of priv->stats_nch),
>>
>> Yes, tho, not sure whether the "if i < ...num_channels" is even
>> necessary, as core already checks against real_num_rx_queues.
>>
>>> and when
>>> summing mlx5e_sq_stats in the latter function, it's up to
>>> priv->channels.params.mqprio.num_tc instead of priv->max_opened_tc.
>>>
>>> - mlx5e_get_base_stats accumulates and outputs stats for everything from
>>> priv->channels.params.num_channels to priv->stats_nch, and
>>
>> I'm not sure num_channels gets set to 0 when device is down so possibly
>> from "0 if down else ...num_channels" to stats_nch.
>
> Yea, you were right:
>
>   if (priv->channels.num == 0)
>           i = 0;
>   else
>           i = priv->channels.params.num_channels;
>
>   for (; i < priv->stats_nch; i++) {
>
> Seems to be working now when I adjust the queue count and the test is
> passing as I adjust the queue count up or down. Cool.
>
I agree that get_base should include the stats of all inactive queues.
But it's not straightforward to implement.
A few guiding points:
- Use mlx5e_get_dcb_num_tc(params) for the current num_tc.
- txq_ix (within real_num_tx_queues) is calculated as
  c->ix + tc * params->num_channels.
- The txqsq stats struct is selected by channel_stats[c->ix]->sq[tc].
This means the base stats should include SQ stats for:
1. All SQs of non-active channels, i.e. ch in
   [params.num_channels, priv->stats_nch), tc in [0, priv->max_opened_tc).
2. All SQs of non-active TCs in active channels, i.e. ch in
   [0, params.num_channels), tc in
   [mlx5e_get_dcb_num_tc(params), priv->max_opened_tc).
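
To make the two sets concrete, here is a rough standalone sketch of the
accumulation. It is not driver code: the ex_* types, EX_MAX_TC and the flat
array of per-channel stats are hypothetical simplifications, and the
parameters stand in for params.num_channels, mlx5e_get_dcb_num_tc(params),
priv->stats_nch and priv->max_opened_tc. It assumes num_channels <= stats_nch
and num_tc <= max_opened_tc <= EX_MAX_TC.

```c
#include <stdint.h>

#define EX_MAX_TC 8 /* hypothetical bound, only for this sketch */

/* Hypothetical stand-ins for the per-SQ and per-channel stats structs. */
struct ex_sq_stats {
	uint64_t packets;
};

struct ex_channel_stats {
	struct ex_sq_stats sq[EX_MAX_TC];
};

/* Sum TX packets over every SQ that is not currently active. */
static uint64_t ex_base_tx_packets(const struct ex_channel_stats *cs,
				   unsigned int num_channels,
				   unsigned int num_tc,
				   unsigned int stats_nch,
				   unsigned int max_opened_tc)
{
	uint64_t sum = 0;
	unsigned int ch, tc;

	/* Set 1: all SQs of non-active channels. */
	for (ch = num_channels; ch < stats_nch; ch++)
		for (tc = 0; tc < max_opened_tc; tc++)
			sum += cs[ch].sq[tc].packets;

	/* Set 2: SQs of non-active TCs in active channels. */
	for (ch = 0; ch < num_channels; ch++)
		for (tc = num_tc; tc < max_opened_tc; tc++)
			sum += cs[ch].sq[tc].packets;

	return sum;
}
```

Passing num_channels = 0 makes set 1 cover every channel, which is what the
"0 if down" start index in the quoted snippet is meant to achieve.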
Now I actually see that the patch has issues in mlx5e_get_queue_stats_tx.
You should not loop over all TCs of channel index i.
Instead, do a reverse mapping from "i" to the pair [ch_ix, tc], and then
access the stats of a single TXQ via priv->channel_stats[ch_ix]->sq[tc].
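
As a sketch of that reverse mapping, assuming the forward mapping is exactly
txq_ix = ch_ix + tc * num_channels as described above (struct ex_txq_pos and
both helper names are made up for illustration, they do not exist in mlx5):

```c
#include <assert.h>

/* Hypothetical pair describing where a TXQ lives. */
struct ex_txq_pos {
	unsigned int ch_ix;
	unsigned int tc;
};

/* Forward mapping: txq_ix = ch_ix + tc * num_channels. */
static unsigned int ex_txq_ix_from_pos(unsigned int ch_ix, unsigned int tc,
				       unsigned int num_channels)
{
	return ch_ix + tc * num_channels;
}

/* Reverse mapping from a TXQ index i back to [ch_ix, tc]. */
static struct ex_txq_pos ex_txq_pos_from_ix(unsigned int i,
					    unsigned int num_channels)
{
	struct ex_txq_pos pos;

	assert(num_channels > 0);
	pos.ch_ix = i % num_channels;
	pos.tc = i / num_channels;
	return pos;
}
```

With 4 channels, for example, i = 5 maps to ch_ix = 1, tc = 1, so only that
one SQ's stats would be reported for queue 5.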
> Adding TCs to the NIC triggers the test to fail, so there's still some bug
> in how I'm accumulating stats from the hw TCs.