lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250917055355.GA31577@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
Date: Tue, 16 Sep 2025 22:53:55 -0700
From: Erni Sri Satya Vennela <ernis@...ux.microsoft.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: kys@...rosoft.com, haiyangz@...rosoft.com, wei.liu@...nel.org,
	decui@...rosoft.com, andrew+netdev@...n.ch, davem@...emloft.net,
	edumazet@...gle.com, kuba@...nel.org, longli@...rosoft.com,
	kotaranov@...rosoft.com, horms@...nel.org,
	shradhagupta@...ux.microsoft.com, dipayanroy@...ux.microsoft.com,
	shirazsaleem@...rosoft.com, rosenp@...il.com,
	linux-hyperv@...r.kernel.org, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: Re: [PATCH 2/2] net: mana: Add standard counter rx_missed_errors

On Tue, Sep 16, 2025 at 03:22:54PM +0200, Paolo Abeni wrote:
> On 9/15/25 5:58 AM, Erni Sri Satya Vennela wrote:
> > Report standard counter stats->rx_missed_errors
> > using hc_rx_discards_no_wqe from the hardware.
> > 
> > Add a dedicated workqueue to periodically run
> > mana_query_gf_stats every 2 seconds to get the latest
> > info in eth_stats and define a driver capability flag
> > to notify hardware of the periodic queries.
> > 
> > To avoid repeated failures and log flooding, the workqueue
> > is not rescheduled if mana_query_gf_stats fails.
> 
> Can the failure root cause be a "transient" one? If so, this looks like
> a dangerous strategy; is such scenario, AFAICS, stats will be broken
> until the device is removed and re-probed.
> 
We are working on using the stats query as a health check for the
hardware and its channel. Even if it fails once, the VF needs to
be reset, similar to a probe. The hardware team also confirmed that even
a one-time or temporary failure needs a VF reset.

- Vennela

> /P

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ