Message-ID: <ZplpKq8FKi3vwfxv@gmail.com>
Date: Thu, 18 Jul 2024 12:12:42 -0700
From: Breno Leitao <leitao@...ian.org>
To: Dragos Tatulea <dtatulea@...dia.com>
Cc: Tariq Toukan <tariqt@...dia.com>, Saeed Mahameed <saeedm@...dia.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: mlx5e warnings on 6.10
On Thu, Jul 18, 2024 at 11:00:00AM +0000, Dragos Tatulea wrote:
> Hi Breno,
>
> On Wed, 2024-07-17 at 04:41 -0700, Breno Leitao wrote:
> > Sharing in case you find it useful.
> Thanks for the report. The output, it is very useful. The problem seems to be
> that mlx5e_tx_reporter_timeout_recover() should take a state lock and doesn't.
Right. I've looked at the other places where mlx5e_safe_reopen_channels() is
called, and priv->state_lock is, in fact, held before calling it.
So, regardless of whether this fixes the problem or not, it seems the right
thing to do.
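To illustrate the pattern, here is a minimal userspace sketch. It is not the
driver code: struct priv_model and both *_model() functions are hypothetical
stand-ins, and a pthread mutex plays the role of priv->state_lock. It only
shows the idea that the recover path takes the lock before reopening the
channels, the way the other callers do.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private state. */
struct priv_model {
	pthread_mutex_t state_lock; /* models priv->state_lock */
	bool lock_held;             /* bookkeeping so the callee can check */
	int reopen_count;           /* how many times channels were reopened */
};

/* Models a reopen helper that expects its caller to hold state_lock. */
static int safe_reopen_channels_model(struct priv_model *priv)
{
	assert(priv->lock_held); /* trips if a caller forgot the lock */
	priv->reopen_count++;
	return 0;
}

/* Models the timeout recover path with the lock taken around the reopen. */
static int timeout_recover_model(struct priv_model *priv)
{
	int err;

	pthread_mutex_lock(&priv->state_lock);
	priv->lock_held = true;
	err = safe_reopen_channels_model(priv);
	priv->lock_held = false;
	pthread_mutex_unlock(&priv->state_lock);

	return err;
}
```

Calling safe_reopen_channels_model() directly, without the lock, trips the
assert in this sketch. That is the userspace analogue of a caller that skips
the state lock.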
Feel free to add a "Reviewed-by: Breno Leitao <leitao@...ian.org>" when
you send it.
> I wonder why this happened only in 6.10. There were no relevant changes in 6.10.
> Or is it maybe that until now you didn't run into the tx queue timeout issue?
I don't have a reproducer for it, so I have only seen it on 6.10. Maybe just
a coincidence?
> Would you have the possibility and willingness to test the below fix?
Sure. I have two hosts running with your patch, but it is hard to make
them time out.
Let me know if you have any trick I can use to force the card to
time out.
Thanks for the quick reply!
--breno