netdev - Re: [PATCH V4 net] net: mana: Fix MANA VF unload when host is unresponsive

Message-ID: <83ef6401-8736-8416-c898-2fbbb786726e@intel.com>
Date: Mon, 3 Jul 2023 18:47:49 +0200
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: souradeep chakrabarti <schakrabarti@...ux.microsoft.com>
CC: <kys@...rosoft.com>, <haiyangz@...rosoft.com>, <wei.liu@...nel.org>,
	<decui@...rosoft.com>, <davem@...emloft.net>, <edumazet@...gle.com>,
	<kuba@...nel.org>, <pabeni@...hat.com>, <longli@...rosoft.com>,
	<sharmaajay@...rosoft.com>, <leon@...nel.org>, <cai.huoqing@...ux.dev>,
	<ssengar@...ux.microsoft.com>, <vkuznets@...hat.com>, <tglx@...utronix.de>,
	<linux-hyperv@...r.kernel.org>, <netdev@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <linux-rdma@...r.kernel.org>,
	<stable@...r.kernel.org>, <schakrabarti@...rosoft.com>
Subject: Re: [PATCH V4 net] net: mana: Fix MANA VF unload when host is
 unresponsive

From: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>
Date: Mon,  3 Jul 2023 01:49:31 -0700

> From: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>

Please sync your Git name and Git mail account settings, so that your
own patches won't have "From:" when sending. From what I see, you need
to capitalize the first letters of your name and surname in the Git
email settings block.

> 
> When unloading the MANA driver, mana_dealloc_queues() waits for the MANA
> hardware to complete any inflight packets and set the pending send count
> to zero. But if the hardware has failed, mana_dealloc_queues()
> could wait forever.
> 
> Fix this by adding a timeout to the wait. Set the timeout to 120 seconds,
> which is a somewhat arbitrary value that is more than long enough for
> functional hardware to complete any sends.
> 
> Signed-off-by: Souradeep Chakrabarti <schakrabarti@...ux.microsoft.com>

Where's the "Fixes:" tag pointing at the blamed commit?

> ---
> V3 -> V4:
> * Fixed the commit message to describe the context.
> * Removed the vf_unload_timeout, as it is not required.
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 26 ++++++++++++++++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index a499e460594b..d26f1da70411 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2346,7 +2346,10 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  {
>  	struct mana_port_context *apc = netdev_priv(ndev);
>  	struct gdma_dev *gd = apc->ac->gdma_dev;
> +	unsigned long timeout;
>  	struct mana_txq *txq;
> +	struct sk_buff *skb;
> +	struct mana_cq *cq;
>  	int i, err;
>  
>  	if (apc->port_is_up)
> @@ -2363,15 +2366,32 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  	 * to false, but it doesn't matter since mana_start_xmit() drops any
>  	 * new packets due to apc->port_is_up being false.
>  	 *
> -	 * Drain all the in-flight TX packets
> +	 * Drain all the in-flight TX packets.
> +	 * A timeout of 120 seconds for all the queues is used.
> +	 * This will break the while loop when h/w is not responding.
> +	 * This value of 120 has been decided here considering max
> +	 * number of queues.
>  	 */
> +
> +	timeout = jiffies + 120 * HZ;

Why not initialize it right when declaring?

>  	for (i = 0; i < apc->num_queues; i++) {
>  		txq = &apc->tx_qp[i].txq;
> -
> -		while (atomic_read(&txq->pending_sends) > 0)
> +		while (atomic_read(&txq->pending_sends) > 0 &&
> +		       time_before(jiffies, timeout)) {
>  			usleep_range(1000, 2000);
> +		}
>  	}

120 seconds at a 2 msec step is 60000 iterations, at a 1 msec step it is
120000 iterations. I know usleep_range() is often much less precise, but
still. Do you really need that much time? Has it been measured during
testing that the drain can take up to 120 seconds, or is it just some
random value that "should be enough"?
If you really need 120 seconds, I'd suggest using a timer / delayed work
instead of wasting resources.
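
To make the iteration counts above concrete, here is a minimal user-space
sketch, not kernel code. The helper name max_poll_iterations() is
hypothetical, and it assumes each pass of the loop sleeps exactly step_us
microseconds, which usleep_range() does not guarantee.

```c
/*
 * Hypothetical helper (not part of the MANA driver): worst-case number of
 * polling passes when a loop sleeps step_us microseconds per pass until
 * timeout_s seconds have elapsed. Assumes the sleep is exact.
 */
static unsigned long max_poll_iterations(unsigned long timeout_s,
					 unsigned long step_us)
{
	/* Convert the timeout to microseconds, then divide by the step. */
	return timeout_s * 1000000UL / step_us;
}
```

With the patch's values, max_poll_iterations(120, 2000) corresponds to
the 2 msec upper bound of usleep_range(1000, 2000) and
max_poll_iterations(120, 1000) to the 1 msec lower bound.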

>  
> +	for (i = 0; i < apc->num_queues; i++) {
> +		txq = &apc->tx_qp[i].txq;
> +		cq = &apc->tx_qp[i].tx_cq;

cq can be just &txq->tx_cq.

> +		while (atomic_read(&txq->pending_sends)) {
> +			skb = skb_dequeue(&txq->pending_skbs);
> +			mana_unmap_skb(skb, apc);
> +			napi_consume_skb(skb, cq->budget);

(you already have a comment about this one)

> +			atomic_sub(1, &txq->pending_sends);
> +		}
> +	}
>  	/* We're 100% sure the queues can no longer be woken up, because
>  	 * we're sure now mana_poll_tx_cq() can't be running.
>  	 */

Thanks,
Olek