Message-ID: <20240414102337.GA645060@kernel.org>
Date: Sun, 14 Apr 2024 11:23:37 +0100
From: Simon Horman <horms@...nel.org>
To: Nick Child <nnac123@...ux.ibm.com>
Cc: netdev@...r.kernel.org, haren@...ux.ibm.com, ricklind@...ibm.com,
	mmc@...ux.ibm.com
Subject: Re: [PATCH net-next] ibmvnic: Return error code on TX scrq flush fail

On Thu, Apr 11, 2024 at 03:34:35PM -0500, Nick Child wrote:
> In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then
> it will inform upper level networking functions to disable tx
> queues. H_CLOSED signals that the connection with the vnic server is
> down and a transport event is expected to recover the device.
> 
> Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success.
> Therefore, the queues would remain active until ibmvnic_cleanup() is
> called within do_reset().
> 
> The problem is that do_reset() depends on the RTNL lock. If several
> ibmvnic devices are resetting then there can be a long wait time until
> the last device can grab the lock. During this time the tx/rx queues
> still appear active to upper level functions.
> 
> FYI, we do make a call to netif_carrier_off() outside the RTNL lock but
> its calls to dev_deactivate() are also dependent on the RTNL lock.
> 
> As a result, a large number of retransmissions was observed in a short
> period of time, eventually leading to ETIMEDOUT. This was specifically
> seen with HNV devices, likely because of even more RTNL dependencies.
> 
> Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is
> propagated to the xmit function to allow for an earlier (and lock-less)
> response to a transport event.
> 
> Signed-off-by: Nick Child <nnac123@...ux.ibm.com>
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 30c47b8470ad..f5177f370354 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2371,7 +2371,7 @@ static int ibmvnic_tx_scrq_flush(struct ibmvnic_adapter *adapter,
>  		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
>  	else
>  		ind_bufp->index = 0;
> -	return 0;
> +	return rc;
>  }
>  
>  static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)

Hi Nick,

I notice that in some, but not all, cases the return value of
ibmvnic_tx_scrq_flush() is not checked. Should that also be
addressed?
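
To illustrate the question, here is a minimal userspace sketch of the
pattern, not the driver code. Every mock_* name and every MOCK_H_* value is
hypothetical and exists only for illustration. It assumes the flush helper
now returns the send result, and shows a caller that checks it so the queue
is stopped as soon as a closed connection is reported, without taking any
lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values are illustrative, not real hcall codes. */
enum mock_hrc { MOCK_H_SUCCESS = 0, MOCK_H_CLOSED = 2 };

struct mock_txq {
	int pending;            /* descriptors batched but not yet sent */
	bool stopped;           /* models the tx queue having been stopped */
	unsigned long dropped;  /* packets dropped because a flush failed */
	enum mock_hrc next_rc;  /* result the pretend send call will report */
};

/* Models the patched flush: report the send result instead of a constant 0. */
static int mock_tx_flush(struct mock_txq *q)
{
	int rc;

	assert(q != NULL);
	if (q->pending == 0)
		return MOCK_H_SUCCESS;

	rc = q->next_rc;
	/* Sent or cleaned up, the batch is emptied either way. */
	q->pending = 0;
	return rc;
}

/*
 * Models a caller that checks the flush result. Returns true if the
 * packet was handed off, false if it was dropped.
 */
static bool mock_xmit(struct mock_txq *q)
{
	int rc;

	assert(q != NULL);
	q->pending++;
	rc = mock_tx_flush(q);
	if (rc == MOCK_H_SUCCESS)
		return true;

	q->dropped++;
	/* A closed connection: stop the queue now rather than waiting for reset. */
	if (rc == MOCK_H_CLOSED)
		q->stopped = true;
	return false;
}
```

A caller that ignores the result of mock_tx_flush() would leave
q->stopped false in the closed case, which is the situation the unchecked
call sites may be in.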
