netdev - Re: [net PATCH v2 4/8] fbnic: Actually flush_tx instead of stalling out

Message-ID: <b85c0c94-6c31-4c27-b90a-0e8c540d8751@intel.com>
Date: Tue, 6 May 2025 11:52:29 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Alexander Duyck <alexander.duyck@...il.com>, <netdev@...r.kernel.org>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
	<horms@...nel.org>
Subject: Re: [net PATCH v2 4/8] fbnic: Actually flush_tx instead of stalling
 out



On 5/6/2025 8:59 AM, Alexander Duyck wrote:
> From: Alexander Duyck <alexanderduyck@...com>
> 
> The fbnic_mbx_flush_tx function had a number of issues.
> 
> First, we were waiting 200ms for the firmware to process the packets. We
> can drop this to 20ms and in almost all cases this should be more than
> enough time. So by changing this we can significantly reduce shutdown time.
> 
> Second, we were not making sure that the Tx path was actually shut off. As
> such we could still have packets added while we were flushing the mailbox.
> To prevent that we can now clear the ready flag for the Tx side and it
> should stay down since the interrupt is disabled.
> 
> Third, we kept re-reading the tail due to the second issue. The tail should
> not move after we have started the flush so we can just read it once while
> we are holding the mailbox Tx lock. By doing that we are guaranteed that
> the value should be consistent.
> 
> Fourth, we were keeping a count of descriptors cleaned due to the second
> and third issues called out. That count is not a valid reason to exit the
> cleanup. With the tail only being read once, we shouldn't see any cases
> where the tail moves after the disable, so the tracking of count can be
> dropped.
> 
> Fifth, we were using attempts * sleep time to determine how long we would
> wait in our polling loop to flush out the Tx. This can be very imprecise.
> To tighten up the timing we are switching to a deadline of
> jiffies + 10 * HZ + 1 to determine when we should stop polling, as this
> should be accurate to within one sleep cycle for the total amount of time
> spent polling.
> 
> Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
> Signed-off-by: Alexander Duyck <alexanderduyck@...com>
> Reviewed-by: Simon Horman <horms@...nel.org>

Reviewed-by: Jacob Keller <jacob.e.keller@...el.com>

>  
>  	/* Give firmware time to process packet,
> -	 * we will wait up to 10 seconds which is 50 waits of 200ms.
> +	 * we will wait up to 10 seconds which is 500 waits of 20ms.
>  	 */
>  	do {
>  		u8 head = tx_mbx->head;
>  
> -		if (head == tx_mbx->tail)
> +		/* Tx ring is empty once head == tail */
> +		if (head == tail)
>  			break;
>  
> -		msleep(200);
> +		msleep(20);
>  		fbnic_mbx_process_tx_msgs(fbd);
> -
> -		count += (tx_mbx->head - head) % FBNIC_IPC_MBX_DESC_LEN;
> -	} while (count < FBNIC_IPC_MBX_DESC_LEN && --attempts);
> +	} while (time_is_after_jiffies(timeout));
>  }


This block makes me think of read_poll_timeout(), but I guess that
doesn't quite fit this implementation since you aren't just doing a
simple register read: each pass also has to call
fbnic_mbx_process_tx_msgs().

Thanks,
Jake