[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <076e1623-6344-470a-bdff-d8a594ce17d6@nvidia.com>
Date: Thu, 27 Nov 2025 15:53:50 +0200
From: Shahar Shitrit <shshitrit@...dia.com>
To: Paolo Abeni <pabeni@...hat.com>, Tariq Toukan <tariqt@...dia.com>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>
Cc: Jian Shen <shenjian15@...wei.com>, Salil Mehta <salil.mehta@...wei.com>,
Jijie Shao <shaojijie@...wei.com>, Saeed Mahameed <saeedm@...dia.com>,
Mark Bloch <mbloch@...dia.com>, Leon Romanovsky <leon@...nel.org>,
Jamal Hadi Salim <jhs@...atatu.com>, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
Gal Pressman <gal@...dia.com>, Moshe Shemesh <moshe@...dia.com>,
Yael Chemla <ychemla@...dia.com>
Subject: Re: [PATCH net-next 1/3] net: Introduce
netif_xmit_time_out_duration() helper
On 27/11/2025 12:53, Paolo Abeni wrote:
> On 11/25/25 8:12 AM, Tariq Toukan wrote:
>> From: Shahar Shitrit <shshitrit@...dia.com>
>>
>> Introduce a new helper function netif_xmit_time_out_duration() to
>> check if a TX queue has timed out and report the timeout duration.
>> This helper consolidates the logic that is duplicated in several
>> locations and also encapsulates the check for whether the TX queue
>> is stopped.
>>
>> As the first user, convert dev_watchdog() to use this helper.
>>
>> Signed-off-by: Shahar Shitrit <shshitrit@...dia.com>
>> Reviewed-by: Yael Chemla <ychemla@...dia.com>
>> Signed-off-by: Tariq Toukan <tariqt@...dia.com>
>> ---
>> include/linux/netdevice.h | 15 +++++++++++++++
>> net/sched/sch_generic.c | 7 +++----
>> 2 files changed, 18 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index e808071dbb7d..3cd73769fcfa 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -3680,6 +3680,21 @@ static inline bool netif_xmit_stopped(const struct netdev_queue *dev_queue)
>> return dev_queue->state & QUEUE_STATE_ANY_XOFF;
>> }
>>
>> +static inline unsigned int
>> +netif_xmit_timeout_ms(struct netdev_queue *txq, unsigned long *trans_start)
>> +{
>> + unsigned long txq_trans_start = READ_ONCE(txq->trans_start);
>> +
>> + if (trans_start)
>> + *trans_start = txq_trans_start;
>
> What about making this argument mandatory?
Since not all callers are interested in this return value, as in the
case of mlx5, it would be nice to allow them pass NULL.>
>> +
>> + if (netif_xmit_stopped(txq) &&
>
> Why restricting to the <queue stopped> case? AFAICS the watchdog is
> intended to additionally catch the scenarios where the rx ring is not
> full but the H/W is stuck for whatever reasons, and this change will not
> catch them anymore.
>
> /P
>
dev_watchdog catches only the cases where tx queue is both full (queue
stopped) and timed out (last transmit timestamp was longer ago than the
watchdog timeout period). We wanted to preserve the same conditions.
By the way, I notice now that the new helper name in the title doesn't
match the one in the code. We will fix.
Powered by blists - more mailing lists