Message-ID: <aXPiZ8H-usRn1pcD@pengutronix.de>
Date: Fri, 23 Jan 2026 22:04:39 +0100
From: Oleksij Rempel <o.rempel@...gutronix.de>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Mohsin Bashir <mohsin.bashr@...il.com>, netdev@...r.kernel.org,
	alexanderduyck@...com, alok.a.tiwari@...cle.com,
	andrew+netdev@...n.ch, andrew@...n.ch, chuck.lever@...cle.com,
	davem@...emloft.net, donald.hunter@...il.com, edumazet@...gle.com,
	gal@...dia.com, horms@...nel.org, idosch@...dia.com,
	jacob.e.keller@...el.com, kernel-team@...a.com,
	kory.maincent@...tlin.com, lee@...ger.us, pabeni@...hat.com,
	vadim.fedorenko@...ux.dev, kernel@...gutronix.de
Subject: Re: [PATCH net-next 0/3] net: ethtool: Track TX pause storm

On Fri, Jan 23, 2026 at 10:40:31AM -0800, Jakub Kicinski wrote:
> On Fri, 23 Jan 2026 12:28:13 +0100 Oleksij Rempel wrote:
> > Here is a TL;DR summary of my questions regarding the pause storm logic
> > :)
> 
> Eh, did you get AI to help write the full version? :) So much text :)

Yes, to reduce the text! Sorry, sometimes I have word diarrhea :D

> > - Should we standardize an "RX Watchdog" mechanism in the core instead of
> >   or in addition to driver-specific stats?
> 
> Our primary use case is machine is hard-wedged. Either Linux crash, or
> kexec died, or UEFI issue. So it must be the device that implements the
> logic.
> 
> Florian was proposing a hook to auto-disable pause from the crash
> notifier. It sounds like your use case is closer to that?

It is a valid use case, and it would be nice to have it too.

In my tests, I was able to trigger an Rx stall and, if flow control is
enabled, a pause storm, for example by partially disrupting the USB
connection. Since this controller is used in medical devices, it would
be good to detect these anomalies and attempt recovery.

Sorry, here I want to hijack this discussion for my purpose :)

Since a pause storm is only a symptom of an Rx stall, should we have a
common method to detect the stall itself? Is that even reasonably
possible? In my case, I tried to detect it by monitoring the level of
the Rx queue, the Rx HW counters, and the Rx SW counters. But maybe I
just have a blind spot and this is a naive way to detect things.
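
To make the idea concrete, here is a minimal sketch of such a check. It
is only an illustration under assumptions: the struct names, fields, and
thresholds are hypothetical and do not correspond to any existing kernel
or driver API. It flags a stall when the SW counter stops advancing
while either the HW counter still advances or the Rx queue sits at or
above a high-water mark, for several consecutive samples.

```c
#include <stdbool.h>
#include <stdint.h>

/* One periodic sample. All fields are hypothetical placeholders. */
struct rx_sample {
	uint64_t hw_rx_frames;  /* frames counted by the MAC */
	uint64_t sw_rx_packets; /* packets handed to the stack */
	uint32_t queue_level;   /* current Rx queue fill level */
};

struct rx_stall_detector {
	struct rx_sample prev;    /* previous sample, valid once primed */
	uint32_t queue_high_mark; /* level treated as "queue is full" */
	uint32_t stall_threshold; /* consecutive suspicious samples needed */
	uint32_t suspicious;      /* current run of suspicious samples */
	bool primed;              /* true after the first sample */
};

static void rx_stall_init(struct rx_stall_detector *d,
			  uint32_t queue_high_mark,
			  uint32_t stall_threshold)
{
	*d = (struct rx_stall_detector){
		.queue_high_mark = queue_high_mark,
		.stall_threshold = stall_threshold,
	};
}

/*
 * Feed one sample. Returns true once the stall condition has held for
 * stall_threshold consecutive samples. The first sample only primes
 * the detector because there is nothing to compare it against.
 */
static bool rx_stall_update(struct rx_stall_detector *d,
			    const struct rx_sample *cur)
{
	bool suspicious = false;

	if (d->primed) {
		bool sw_idle = cur->sw_rx_packets == d->prev.sw_rx_packets;
		bool hw_active = cur->hw_rx_frames != d->prev.hw_rx_frames;
		bool queue_full = cur->queue_level >= d->queue_high_mark;

		/* No SW progress although HW sees traffic or the queue is full. */
		suspicious = sw_idle && (hw_active || queue_full);
	}

	d->prev = *cur;
	d->primed = true;
	d->suspicious = suspicious ? d->suspicious + 1 : 0;

	return d->suspicious >= d->stall_threshold;
}
```

Requiring several consecutive suspicious samples is a design choice to
avoid false positives on an idle link or during a short burst; the
right sampling period and threshold would depend on the hardware.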

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
