Message-ID: <20260123142136.21ddc213@kernel.org>
Date: Fri, 23 Jan 2026 14:21:36 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Oleksij Rempel <o.rempel@...gutronix.de>
Cc: Mohsin Bashir <mohsin.bashr@...il.com>, netdev@...r.kernel.org,
alexanderduyck@...com, alok.a.tiwari@...cle.com, andrew+netdev@...n.ch,
andrew@...n.ch, chuck.lever@...cle.com, davem@...emloft.net,
donald.hunter@...il.com, edumazet@...gle.com, gal@...dia.com,
horms@...nel.org, idosch@...dia.com, jacob.e.keller@...el.com,
kernel-team@...a.com, kory.maincent@...tlin.com, lee@...ger.us,
pabeni@...hat.com, vadim.fedorenko@...ux.dev, kernel@...gutronix.de
Subject: Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm
On Fri, 23 Jan 2026 22:04:39 +0100 Oleksij Rempel wrote:
> In my tests, I was able to trigger an Rx stall and a pause storm (if
> flow control is enabled), for example by partially disrupting the USB
> connection. Since this controller is used in medical devices, it will
> be good to detect these anomalies and attempt recovery.
>
> Sorry, here I want to hijack this discussion for my purpose :)
>
> Since a pause storm is only a symptom of an Rx stall, should we have a
> common method to detect it? Is it even reasonably possible? In my cases,
> I tried to detect it by monitoring the level of the Rx queue, Rx HW
> counters, and Rx SW counters. But maybe I just have a blind spot and
> this is a naive way to detect things.
Not sure what USB connection disruption entails exactly, but there are
presumably a lot of things that can go wrong in critical infra, which
some daemon must periodically check for and remediate.
IDK if this belongs in the kernel, but perhaps folks with more embedded
experience would find it useful.
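
Purely as an illustration of the kind of periodic userspace check
mentioned above, here is a minimal sketch in Python. It assumes the
daemon can sample an Rx packet counter (the sysfs path below is the
standard per-interface statistics file) and a count of pause frames
sent. How the pause count is obtained is driver-specific and is left to
the caller. The class name, the threshold and the heuristic itself are
hypothetical, not something proposed in this thread's patches.

```python
from pathlib import Path
from typing import Optional


def read_rx_packets(ifname: str) -> int:
    """Read the Rx packet counter the kernel exposes via sysfs."""
    path = Path(f"/sys/class/net/{ifname}/statistics/rx_packets")
    return int(path.read_text())


class RxStallDetector:
    """Hypothetical heuristic: suspect an Rx stall when the Rx counter
    stays flat while the count of transmitted pause frames keeps growing
    for `threshold` consecutive polls."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._last_rx: Optional[int] = None
        self._last_pause: Optional[int] = None
        self._suspect_polls = 0

    def update(self, rx_packets: int, tx_pause_frames: int) -> bool:
        """Feed one sample; return True once a stall is suspected."""
        have_prev = self._last_rx is not None and self._last_pause is not None
        if (
            have_prev
            and rx_packets == self._last_rx
            and tx_pause_frames > self._last_pause
        ):
            # No Rx progress, yet we are still asking the peer to pause.
            self._suspect_polls += 1
        else:
            # First sample, Rx progress, or no new pause frames: reset.
            self._suspect_polls = 0
        self._last_rx = rx_packets
        self._last_pause = tx_pause_frames
        return self._suspect_polls >= self.threshold
```

A daemon would call update() once per polling interval and trigger
whatever recovery it considers appropriate when it returns True. An idle
link with no pause frames is deliberately not flagged, since a flat Rx
counter alone cannot distinguish "stalled" from "nothing to receive".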