Message-ID: <20251126200541.00e5270f@kernel.org>
Date: Wed, 26 Nov 2025 20:05:41 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Dipayaan Roy <dipayanroy@...ux.microsoft.com>
Cc: kys@...rosoft.com, haiyangz@...rosoft.com, wei.liu@...nel.org,
 decui@...rosoft.com, andrew+netdev@...n.ch, davem@...emloft.net,
 edumazet@...gle.com, pabeni@...hat.com, longli@...rosoft.com,
 kotaranov@...rosoft.com, horms@...nel.org,
 shradhagupta@...ux.microsoft.com, ssengar@...ux.microsoft.com,
 ernis@...ux.microsoft.com, shirazsaleem@...rosoft.com,
 linux-hyperv@...r.kernel.org, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-rdma@...r.kernel.org,
 dipayanroy@...rosoft.com
Subject: Re: [PATCH net-next, v4] net: mana: Implement ndo_tx_timeout and
 serialize queue resets per port.

On Sun, 23 Nov 2025 10:08:18 -0800 Dipayaan Roy wrote:
> Implement .ndo_tx_timeout for MANA so any stalled TX queue can be detected
> and a device-controlled port reset for all queues can be scheduled to an
> ordered workqueue. The reset for all queues on stall detection is
> recommended by the hardware team.
> 
> The change introduces a single ordered workqueue
> "mana_per_port_queue_reset_wq" queuing one work_struct per port,
> using WQ_UNBOUND | WQ_MEM_RECLAIM so stalled queue reset work can
> run on any CPU and still make forward progress under memory
> pressure.

And we need to be able to reset the NIC queue under memory pressure
because...? I could be wrong, but I still find this unusual / defensive
programming. If you could point me at some existing drivers that do
this, that'd help.

> @@ -3287,6 +3341,7 @@ static int mana_probe_port(struct mana_context *ac, int port_idx,
>  	ndev->min_mtu = ETH_MIN_MTU;
>  	ndev->needed_headroom = MANA_HEADROOM;
>  	ndev->dev_port = port_idx;
> +	ndev->watchdog_timeo = 15 * HZ;

5 sec (5 * HZ) is typical, off the top of my head.

> @@ -3647,6 +3717,11 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
>  		free_netdev(ndev);
>  	}
>  
> +	if (ac->per_port_queue_reset_wq) {
> +		destroy_workqueue(ac->per_port_queue_reset_wq);
> +		ac->per_port_queue_reset_wq = NULL;
> +	}

I think you're missing this cleanup in the failure path of mana_probe(),
so the workqueue would leak if probe fails after it has been allocated.
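
To illustrate the shape of the fix, here is a minimal userspace sketch, not the actual MANA code. All names (fake_ctx, fake_alloc_wq, fake_destroy_wq, fake_probe, fake_remove) are hypothetical stand-ins, and it assumes the workqueue is allocated early in probe. The point is only that the probe error path funnels through the same teardown that remove uses, so the workqueue is released either way:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the workqueue and driver context. */
struct fake_wq { int unused; };

struct fake_ctx {
	struct fake_wq *reset_wq;
};

/* Number of workqueues allocated but not yet destroyed. */
static int live_wqs;

static struct fake_wq *fake_alloc_wq(void)
{
	struct fake_wq *wq = calloc(1, sizeof(*wq));

	if (wq)
		live_wqs++;
	return wq;
}

static void fake_destroy_wq(struct fake_wq *wq)
{
	free(wq);
	live_wqs--;
}

/* Shared teardown: safe to call whether or not the wq was allocated. */
static void fake_remove(struct fake_ctx *ac)
{
	if (ac->reset_wq) {
		fake_destroy_wq(ac->reset_wq);
		ac->reset_wq = NULL;
	}
}

/* fail_port_setup simulates a failure after the wq has been allocated. */
static int fake_probe(struct fake_ctx *ac, bool fail_port_setup)
{
	int err;

	ac->reset_wq = fake_alloc_wq();
	if (!ac->reset_wq)
		return -ENOMEM;

	if (fail_port_setup) {
		err = -EIO;
		goto out;
	}

	return 0;

out:
	/* The error path releases everything probe acquired so far. */
	fake_remove(ac);
	return err;
}
```

Calling fake_probe(&ac, true) on a zeroed context returns -EIO and leaves ac.reset_wq NULL with nothing leaked. Whether the real driver should call mana_remove() from the error path or add a dedicated label is up to you.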
-- 
pw-bot: cr
